named query - Page 623 - Developer IT

How do I use a list of filenames to find a folder on my hard drive, that contains most matches of these filenames?

- by Web Master

I need a program that will use a list of file names to find a folder on my hard drive that contains the most of these filenames. Long story short I made a giant map. This map was live and got ruined. New map data files have been generated, and previous map data files have been altered. What does this mean? This means file sizes have been changed, and there will be new files that have never been in the backup folder. Some files map files could also have been generated in other projects. So there could be filenames on my computer not associated with this due to the way the files are named when created. So If I take an indidual file for example "r.-1.-1.mca" This file could show up on my hard drive 10 times. Anyway, the goal is to take 100 map files, turn them into a list, and then search the hard drive and find the folder that has the highest count of matching map file names. Can anyone figure out a way to do this? I am thinking about manually searching for every single file.

Read the article

Can I host multiple sites with one Amazon EC2 instance [duplicate]

- by user22

This question already has an answer here: Can you help me with my capacity planning? 2 answers I currently have VPS server and I pay around $75 per month and I get: 40GB HD 2Gb RAM 100GB BW 6 core cpu (but i dont use much) I have only one live website running and traffic is only max 100 user visit per day. I mostly do the my testing stuff and some of my inter sites for playing with coding. But I do need one server. I am thinking of moving to Amazon EC2 if the price diff is not so much because then I can learn some more stuff. I am thinking of getting the 3 years Heavy utilization Reserved instance because my server will be running all day and night. I tried their online caluclator with Medium Instance Heavy reserved for 3 years for EC2 it comes $31 per month(effective price) and for EBS and S3 , I think even if thats it $40 for all other stuff. I will be at no loss for what I am getting at present. Am i correct or I missed something?? Now In my current VPS I have Apache for PHP sites and MOD wsgi for python sites. I am not sure if I will be able to do all that stuff in Amazon EC2. Can I host python and PHP sites both in Amazon EC2 instance using Named Virtual Hosts and Ngnix

Read the article

VLAN across a router to give wireless access to remote sites?

- by Don

I've been looking online for this answer, but getting conflicting information. I was under the impression that you couldn't use a VLAN across a router, but maybe it's possible (according to some documentation I see online)? I was hoping someone could clear it up for me. Here's what I'm working with: We have a remote site with a handful of users. We recently gave them an access point (Cisco 1142n) for internal wireless. It's plugged into a switch and working fine (getting IPs from the same DHCP scope as the wired users are getting). Private wireless is set on VL50. At the home office we have private wireless for our internal network working and on VL50, with a test VLAN setup for VL60, which points to our DSL line for the time being. Both private and public wireless works fine internally (not crossing a router). VL50 is named the same at both sites for consistency in naming. If we wanted to give the remote site access to the public wireless (VL60), would that be possible across the routers? For more information, currently the site is connected to the home office via a T1 connection, Cisco routers on both ends. I didn't think it was possible due to the nature of VLANS being layer 2. But, I am from from an expert on this and would appreciate any instruction as to the actual truth of the matter. The end result I'm going for is, how to get our remote sites access to a public (outside) connection along with their private connection, without actually having a DSL (or similar type line) dropped at their location? Thanks in advance for your thoughts.

Read the article

BKF file corruption

- by Naitik Semwaal

I don't wanna ask anything here as I have nothing to ask. Instead of that, if I share some useful info here, would you guys mind? If not, then let me proceed. You must have heard about "Back up", the process in which we create backup copies of our crucial data into a file, called BKF (backup) file. Having a valid BKF file, provides security to our data against unwanted data loss or corruption. Whenever such a critical situation takes place, we can restore our BKF file and get our data back (but only if backed up earlier). Do you guys ever thought that why a BKF file gets corrupted? What could be the reasons which make the BKF file corrupted or inaccessible? One day while googling, I found a blog post named as: Reasons of BKF file corruption. I read it, it was very informative. In this blog, I came to know about the reasons for corruption in BKF files. I shared the blog here so that users can read it and clear their doubts of BKF file corruption. I hope this would be helpful.

Read the article

An Introduction to ASP.NET Web API

- by Rick Strahl

Microsoft recently released ASP.NET MVC 4.0 and .NET 4.5 and along with it, the brand spanking new ASP.NET Web API. Web API is an exciting new addition to the ASP.NET stack that provides a new, well-designed HTTP framework for creating REST and AJAX APIs (API is Microsoft’s new jargon for a service, in case you’re wondering). Although Web API ships and installs with ASP.NET MVC 4, you can use Web API functionality in any ASP.NET project, including WebForms, WebPages and MVC or just a Web API by itself. And you can also self-host Web API in your own applications from Console, Desktop or Service applications. If you're interested in a high level overview on what ASP.NET Web API is and how it fits into the ASP.NET stack you can check out my previous post: Where does ASP.NET Web API fit? In the following article, I'll focus on a practical, by example introduction to ASP.NET Web API. All the code discussed in this article is available in GitHub: https://github.com/RickStrahl/AspNetWebApiArticle [republished from my Code Magazine Article and updated for RTM release of ASP.NET Web API] Getting Started To start I’ll create a new empty ASP.NET application to demonstrate that Web API can work with any kind of ASP.NET project. Although you can create a new project based on the ASP.NET MVC/Web API template to quickly get up and running, I’ll take you through the manual setup process, because one common use case is to add Web API functionality to an existing ASP.NET application. This process describes the steps needed to hook up Web API to any ASP.NET 4.0 application. Start by creating an ASP.NET Empty Project. Then create a new folder in the project called Controllers. Add a Web API Controller Class Once you have any kind of ASP.NET project open, you can add a Web API Controller class to it. Web API Controllers are very similar to MVC Controller classes, but they work in any kind of project. Add a new item to this folder by using the Add New Item option in Visual Studio and choose Web API Controller Class, as shown in Figure 1. Figure 1: This is how you create a new Controller Class in Visual Studio Make sure that the name of the controller class includes Controller at the end of it, which is required in order for Web API routing to find it. Here, the name for the class is AlbumApiController. For this example, I’ll use a Music Album model to demonstrate basic behavior of Web API. The model consists of albums and related songs where an album has properties like Name, Artist and YearReleased and a list of songs with a SongName and SongLength as well as an AlbumId that links it to the album. You can find the code for the model (and the rest of these samples) on Github. To add the file manually, create a new folder called Model, and add a new class Album.cs and copy the code into it. There’s a static AlbumData class with a static CreateSampleAlbumData() method that creates a short list of albums on a static .Current that I’ll use for the examples. Before we look at what goes into the controller class though, let’s hook up routing so we can access this new controller. Hooking up Routing in Global.asax To start, I need to perform the one required configuration task in order for Web API to work: I need to configure routing to the controller. Like MVC, Web API uses routing to provide clean, extension-less URLs to controller methods. Using an extension method to ASP.NET’s static RouteTable class, you can use the MapHttpRoute() (in the System.Web.Http namespace) method to hook-up the routing during Application_Start in global.asax.cs shown in Listing 1.using System; using System.Web.Routing; using System.Web.Http; namespace AspNetWebApi { public class Global : System.Web.HttpApplication { protected void Application_Start(object sender, EventArgs e) { RouteTable.Routes.MapHttpRoute( name: "AlbumVerbs", routeTemplate: "albums/{title}", defaults: new { symbol = RouteParameter.Optional, controller="AlbumApi" } ); } } } This route configures Web API to direct URLs that start with an albums folder to the AlbumApiController class. Routing in ASP.NET is used to create extensionless URLs and allows you to map segments of the URL to specific Route Value parameters. A route parameter, with a name inside curly brackets like {name}, is mapped to parameters on the controller methods. Route parameters can be optional, and there are two special route parameters – controller and action – that determine the controller to call and the method to activate respectively. HTTP Verb Routing Routing in Web API can route requests by HTTP Verb in addition to standard {controller},{action} routing. For the first examples, I use HTTP Verb routing, as shown Listing 1. Notice that the route I’ve defined does not include an {action} route value or action value in the defaults. Rather, Web API can use the HTTP Verb in this route to determine the method to call the controller, and a GET request maps to any method that starts with Get. So methods called Get() or GetAlbums() are matched by a GET request and a POST request maps to a Post() or PostAlbum(). Web API matches a method by name and parameter signature to match a route, query string or POST values. In lieu of the method name, the [HttpGet,HttpPost,HttpPut,HttpDelete, etc] attributes can also be used to designate the accepted verbs explicitly if you don’t want to follow the verb naming conventions. Although HTTP Verb routing is a good practice for REST style resource APIs, it’s not required and you can still use more traditional routes with an explicit {action} route parameter. When {action} is supplied, the HTTP verb routing is ignored. I’ll talk more about alternate routes later. When you’re finished with initial creation of files, your project should look like Figure 2. Figure 2: The initial project has the new API Controller Album model Creating a small Album Model Now it’s time to create some controller methods to serve data. For these examples, I’ll use a very simple Album and Songs model to play with, as shown in Listing 2. public class Song { public string AlbumId { get; set; } [Required, StringLength(80)] public string SongName { get; set; } [StringLength(5)] public string SongLength { get; set; } } public class Album { public string Id { get; set; } [Required, StringLength(80)] public string AlbumName { get; set; } [StringLength(80)] public string Artist { get; set; } public int YearReleased { get; set; } public DateTime Entered { get; set; } [StringLength(150)] public string AlbumImageUrl { get; set; } [StringLength(200)] public string AmazonUrl { get; set; } public virtual List<Song> Songs { get; set; } public Album() { Songs = new List<Song>(); Entered = DateTime.Now; // Poor man's unique Id off GUID hash Id = Guid.NewGuid().GetHashCode().ToString("x"); } public void AddSong(string songName, string songLength = null) { this.Songs.Add(new Song() { AlbumId = this.Id, SongName = songName, SongLength = songLength }); } } Once the model has been created, I also added an AlbumData class that generates some static data in memory that is loaded onto a static .Current member. The signature of this class looks like this and that's what I'll access to retrieve the base data:public static class AlbumData { // sample data - static list public static List<Album> Current = CreateSampleAlbumData(); /// <summary> /// Create some sample data /// </summary> /// <returns></returns> public static List<Album> CreateSampleAlbumData() { … }} You can check out the full code for the data generation online. Creating an AlbumApiController Web API shares many concepts of ASP.NET MVC, and the implementation of your API logic is done by implementing a subclass of the System.Web.Http.ApiController class. Each public method in the implemented controller is a potential endpoint for the HTTP API, as long as a matching route can be found to invoke it. The class name you create should end in Controller, which is how Web API matches the controller route value to figure out which class to invoke. Inside the controller you can implement methods that take standard .NET input parameters and return .NET values as results. Web API’s binding tries to match POST data, route values, form values or query string values to your parameters. Because the controller is configured for HTTP Verb based routing (no {action} parameter in the route), any methods that start with Getxxxx() are called by an HTTP GET operation. You can have multiple methods that match each HTTP Verb as long as the parameter signatures are different and can be matched by Web API. In Listing 3, I create an AlbumApiController with two methods to retrieve a list of albums and a single album by its title .public class AlbumApiController : ApiController { public IEnumerable<Album> GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); return albums; } public Album GetAlbum(string title) { var album = AlbumData.Current .SingleOrDefault(alb => alb.AlbumName.Contains(title)); return album; }} To access the first two requests, you can use the following URLs in your browser: http://localhost/aspnetWebApi/albumshttp://localhost/aspnetWebApi/albums/Dirty%20Deeds Note that you’re not specifying the actions of GetAlbum or GetAlbums in these URLs. Instead Web API’s routing uses HTTP GET verb to route to these methods that start with Getxxx() with the first mapping to the parameterless GetAlbums() method and the latter to the GetAlbum(title) method that receives the title parameter mapped as optional in the route. Content Negotiation When you access any of the URLs above from a browser, you get either an XML or JSON result returned back. The album list result for Chrome 17 and Internet Explorer 9 is shown Figure 3. Figure 3: Web API responses can vary depending on the browser used, demonstrating Content Negotiation in action as these two browsers send different HTTP Accept headers. Notice that the results are not the same: Chrome returns an XML response and IE9 returns a JSON response. Whoa, what’s going on here? Shouldn’t we see the same result in both browsers? Actually, no. Web API determines what type of content to return based on Accept headers. HTTP clients, like browsers, use Accept headers to specify what kind of content they’d like to see returned. Browsers generally ask for HTML first, followed by a few additional content types. Chrome (and most other major browsers) ask for: Accept: text/html, application/xhtml+xml,application/xml; q=0.9,*/*;q=0.8 IE9 asks for: Accept: text/html, application/xhtml+xml, */* Note that Chrome’s Accept header includes application/xml, which Web API finds in its list of supported media types and returns an XML response. IE9 does not include an Accept header type that works on Web API by default, and so it returns the default format, which is JSON. This is an important and very useful feature that was missing from any previous Microsoft REST tools: Web API automatically switches output formats based on HTTP Accept headers. Nowhere in the server code above do you have to explicitly specify the output format. Rather, Web API determines what format the client is requesting based on the Accept headers and automatically returns the result based on the available formatters. This means that a single method can handle both XML and JSON results.. Using this simple approach makes it very easy to create a single controller method that can return JSON, XML, ATOM or even OData feeds by providing the appropriate Accept header from the client. By default you don’t have to worry about the output format in your code. Note that you can still specify an explicit output format if you choose, either globally by overriding the installed formatters, or individually by returning a lower level HttpResponseMessage instance and setting the formatter explicitly. More on that in a minute. Along the same lines, any content sent to the server via POST/PUT is parsed by Web API based on the HTTP Content-type of the data sent. The same formats allowed for output are also allowed on input. Again, you don’t have to do anything in your code – Web API automatically performs the deserialization from the content. Accessing Web API JSON Data with jQuery A very common scenario for Web API endpoints is to retrieve data for AJAX calls from the Web browser. Because JSON is the default format for Web API, it’s easy to access data from the server using jQuery and its getJSON() method. This example receives the albums array from GetAlbums() and databinds it into the page using knockout.js.$.getJSON("albums/", function (albums) { // make knockout template visible $(".album").show(); // create view object and attach array var view = { albums: albums }; ko.applyBindings(view); }); Figure 4 shows this and the next example’s HTML output. You can check out the complete HTML and script code at http://goo.gl/Ix33C (.html) and http://goo.gl/tETlg (.js). Figu Figure 4: The Album Display sample uses JSON data loaded from Web API. The result from the getJSON() call is a JavaScript object of the server result, which comes back as a JavaScript array. In the code, I use knockout.js to bind this array into the UI, which as you can see, requires very little code, instead using knockout’s data-bind attributes to bind server data to the UI. Of course, this is just one way to use the data – it’s entirely up to you to decide what to do with the data in your client code. Along the same lines, I can retrieve a single album to display when the user clicks on an album. The response returns the album information and a child array with all the songs. The code to do this is very similar to the last example where we pulled the albums array:$(".albumlink").live("click", function () { var id = $(this).data("id"); // title $.getJSON("albums/" + id, function (album) { ko.applyBindings(album, $("#divAlbumDialog")[0]); $("#divAlbumDialog").show(); }); }); Here the URL looks like this: /albums/Dirty%20Deeds, where the title is the ID captured from the clicked element’s data ID attribute. Explicitly Overriding Output Format When Web API automatically converts output using content negotiation, it does so by matching Accept header media types to the GlobalConfiguration.Configuration.Formatters and the SupportedMediaTypes of each individual formatter. You can add and remove formatters to globally affect what formats are available and it’s easy to create and plug in custom formatters.The example project includes a JSONP formatter that can be plugged in to provide JSONP support for requests that have a callback= querystring parameter. Adding, removing or replacing formatters is a global option you can use to manipulate content. It’s beyond the scope of this introduction to show how it works, but you can review the sample code or check out my blog entry on the subject (http://goo.gl/UAzaR). If automatic processing is not desirable in a particular Controller method, you can override the response output explicitly by returning an HttpResponseMessage instance. HttpResponseMessage is similar to ActionResult in ASP.NET MVC in that it’s a common way to return an abstract result message that contains content. HttpResponseMessage s parsed by the Web API framework using standard interfaces to retrieve the response data, status code, headers and so on[MS2] . Web API turns every response – including those Controller methods that return static results – into HttpResponseMessage instances. Explicitly returning an HttpResponseMessage instance gives you full control over the output and lets you mostly bypass WebAPI’s post-processing of the HTTP response on your behalf. HttpResponseMessage allows you to customize the response in great detail. Web API’s attention to detail in the HTTP spec really shows; many HTTP options are exposed as properties and enumerations with detailed IntelliSense comments. Even if you’re new to building REST-based interfaces, the API guides you in the right direction for returning valid responses and response codes. For example, assume that I always want to return JSON from the GetAlbums() controller method and ignore the default media type content negotiation. To do this, I can adjust the output format and headers as shown in Listing 4.public HttpResponseMessage GetAlbums() { var albums = AlbumData.Current.OrderBy(alb => alb.Artist); // Create a new HttpResponse with Json Formatter explicitly var resp = new HttpResponseMessage(HttpStatusCode.OK); resp.Content = new ObjectContent<IEnumerable<Album>>( albums, new JsonMediaTypeFormatter()); // Get Default Formatter based on Content Negotiation //var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); resp.Headers.ConnectionClose = true; resp.Headers.CacheControl = new CacheControlHeaderValue(); resp.Headers.CacheControl.Public = true; return resp; } This example returns the same IEnumerable<Album> value, but it wraps the response into an HttpResponseMessage so you can control the entire HTTP message result including the headers, formatter and status code. In Listing 4, I explicitly specify the formatter using the JsonMediaTypeFormatter to always force the content to JSON. If you prefer to use the default content negotiation with HttpResponseMessage results, you can create the Response instance using the Request.CreateResponse method:var resp = Request.CreateResponse<IEnumerable<Album>>(HttpStatusCode.OK, albums); This provides you an HttpResponse object that's pre-configured with the default formatter based on Content Negotiation. Once you have an HttpResponse object you can easily control most HTTP aspects on this object. What's sweet here is that there are many more detailed properties on HttpResponse than the core ASP.NET Response object, with most options being explicitly configurable with enumerations that make it easy to pick the right headers and response codes from a list of valid codes. It makes HTTP features available much more discoverable even for non-hardcore REST/HTTP geeks. Non-Serialized Results The output returned doesn’t have to be a serialized value but can also be raw data, like strings, binary data or streams. You can use the HttpResponseMessage.Content object to set a number of common Content classes. Listing 5 shows how to return a binary image using the ByteArrayContent class from a Controller method. [HttpGet] public HttpResponseMessage AlbumArt(string title) { var album = AlbumData.Current.FirstOrDefault(abl => abl.AlbumName.StartsWith(title)); if (album == null) { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found")); return resp; } // kinda silly - we would normally serve this directly // but hey - it's a demo. var http = new WebClient(); var imageData = http.DownloadData(album.AlbumImageUrl); // create response and return var result = new HttpResponseMessage(HttpStatusCode.OK); result.Content = new ByteArrayContent(imageData); result.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg"); return result; } The image retrieval from Amazon is contrived, but it shows how to return binary data using ByteArrayContent. It also demonstrates that you can easily return multiple types of content from a single controller method, which is actually quite common. If an error occurs - such as a resource can’t be found or a validation error – you can return an error response to the client that’s very specific to the error. In GetAlbumArt(), if the album can’t be found, we want to return a 404 Not Found status (and realistically no error, as it’s an image). Note that if you are not using HTTP Verb-based routing or not accessing a method that starts with Get/Post etc., you have to specify one or more HTTP Verb attributes on the method explicitly. Here, I used the [HttpGet] attribute to serve the image. Another option to handle the error could be to return a fixed placeholder image if no album could be matched or the album doesn’t have an image. When returning an error code, you can also return a strongly typed response to the client. For example, you can set the 404 status code and also return a custom error object (ApiMessageError is a class I defined) like this:return Request.CreateResponse<ApiMessageError>( HttpStatusCode.NotFound, new ApiMessageError("Album not found") ); If the album can be found, the image will be returned. The image is downloaded into a byte[] array, and then assigned to the result’s Content property. I created a new ByteArrayContent instance and assigned the image’s bytes and the content type so that it displays properly in the browser. There are other content classes available: StringContent, StreamContent, ByteArrayContent, MultipartContent, and ObjectContent are at your disposal to return just about any kind of content. You can create your own Content classes if you frequently return custom types and handle the default formatter assignments that should be used to send the data out . Although HttpResponseMessage results require more code than returning a plain .NET value from a method, it allows much more control over the actual HTTP processing than automatic processing. It also makes it much easier to test your controller methods as you get a response object that you can check for specific status codes and output messages rather than just a result value. Routing Again Ok, let’s get back to the image example. Using the original routing we have setup using HTTP Verb routing there's no good way to serve the image. In order to return my album art image I’d like to use a URL like this: http://localhost/aspnetWebApi/albums/Dirty%20Deeds/image In order to create a URL like this, I have to create a new Controller because my earlier routes pointed to the AlbumApiController using HTTP Verb routing. HTTP Verb based routing is great for representing a single set of resources such as albums. You can map operations like add, delete, update and read easily using HTTP Verbs. But you cannot mix action based routing into a an HTTP Verb routing controller - you can only map HTTP Verbs and each method has to be unique based on parameter signature. You can't have multiple GET operations to methods with the same signature. So GetImage(string id) and GetAlbum(string title) are in conflict in an HTTP GET routing scenario. In fact, I was unable to make the above Image URL work with any combination of HTTP Verb plus Custom routing using the single Albums controller. There are number of ways around this, but all involve additional controllers. Personally, I think it’s easier to use explicit Action routing and then add custom routes if you need to simplify your URLs further. So in order to accommodate some of the other examples, I created another controller – AlbumRpcApiController – to handle all requests that are explicitly routed via actions (/albums/rpc/AlbumArt) or are custom routed with explicit routes defined in the HttpConfiguration. I added the AlbumArt() method to this new AlbumRpcApiController class. For the image URL to work with the new AlbumRpcApiController, you need a custom route placed before the default route from Listing 1.RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); Now I can use either of the following URLs to access the image: Custom route: (/albums/rpc/{title}/image)http://localhost/aspnetWebApi/albums/PowerAge/image Action route: (/albums/rpc/action/{title})http://localhost/aspnetWebAPI/albums/rpc/albumart/PowerAge Sending Data to the Server To send data to the server and add a new album, you can use an HTTP POST operation. Since I’m using HTTP Verb-based routing in the original AlbumApiController, I can implement a method called PostAlbum()to accept a new album from the client. Listing 6 shows the Web API code to add a new album.public HttpResponseMessage PostAlbum(Album album) { if (!this.ModelState.IsValid) { // my custom error class var error = new ApiMessageError() { message = "Model is invalid" }; // add errors into our client error model for client foreach (var prop in ModelState.Values) { var modelError = prop.Errors.FirstOrDefault(); if (!string.IsNullOrEmpty(modelError.ErrorMessage)) error.errors.Add(modelError.ErrorMessage); else error.errors.Add(modelError.Exception.Message); } return Request.CreateResponse<ApiMessageError>(HttpStatusCode.Conflict, error); } // update song id which isn't provided foreach (var song in album.Songs) song.AlbumId = album.Id; // see if album exists already var matchedAlbum = AlbumData.Current .SingleOrDefault(alb => alb.Id == album.Id || alb.AlbumName == album.AlbumName); if (matchedAlbum == null) AlbumData.Current.Add(album); else matchedAlbum = album; // return a string to show that the value got here var resp = Request.CreateResponse(HttpStatusCode.OK, string.Empty); resp.Content = new StringContent(album.AlbumName + " " + album.Entered.ToString(), Encoding.UTF8, "text/plain"); return resp; } The PostAlbum() method receives an album parameter, which is automatically deserialized from the POST buffer the client sent. The data passed from the client can be either XML or JSON. Web API automatically figures out what format it needs to deserialize based on the content type and binds the content to the album object. Web API uses model binding to bind the request content to the parameter(s) of controller methods. Like MVC you can check the model by looking at ModelState.IsValid. If it’s not valid, you can run through the ModelState.Values collection and check each binding for errors. Here I collect the error messages into a string array that gets passed back to the client via the result ApiErrorMessage object. When a binding error occurs, you’ll want to return an HTTP error response and it’s best to do that with an HttpResponseMessage result. In Listing 6, I used a custom error class that holds a message and an array of detailed error messages for each binding error. I used this object as the content to return to the client along with my Conflict HTTP Status Code response. If binding succeeds, the example returns a string with the name and date entered to demonstrate that you captured the data. Normally, a method like this should return a Boolean or no response at all (HttpStatusCode.NoConent). The sample uses a simple static list to hold albums, so once you’ve added the album using the Post operation, you can hit the /albums/ URL to see that the new album was added. The client jQuery code to call the POST operation from the client with jQuery is shown in Listing 7. var id = new Date().getTime().toString(); var album = { "Id": id, "AlbumName": "Power Age", "Artist": "AC/DC", "YearReleased": 1977, "Entered": "2002-03-11T18:24:43.5580794-10:00", "AlbumImageUrl": http://ecx.images-amazon.com/images/…, "AmazonUrl": http://www.amazon.com/…, "Songs": [ { "SongName": "Rock 'n Roll Damnation", "SongLength": 3.12}, { "SongName": "Downpayment Blues", "SongLength": 4.22 }, { "SongName": "Riff Raff", "SongLength": 2.42 } ] } $.ajax( { url: "albums/", type: "POST", contentType: "application/json", data: JSON.stringify(album), processData: false, beforeSend: function (xhr) { // not required since JSON is default output xhr.setRequestHeader("Accept", "application/json"); }, success: function (result) { // reload list of albums page.loadAlbums(); }, error: function (xhr, status, p3, p4) { var err = "Error"; if (xhr.responseText && xhr.responseText[0] == "{") err = JSON.parse(xhr.responseText).message; alert(err); } }); The code in Listing 7 creates an album object in JavaScript to match the structure of the .NET Album class. This object is passed to the $.ajax() function to send to the server as POST. The data is turned into JSON and the content type set to application/json so that the server knows what to convert when deserializing in the Album instance. The jQuery code hooks up success and failure events. Success returns the result data, which is a string that’s echoed back with an alert box. If an error occurs, jQuery returns the XHR instance and status code. You can check the XHR to see if a JSON object is embedded and if it is, you can extract it by de-serializing it and accessing the .message property. REST standards suggest that updates to existing resources should use PUT operations. REST standards aside, I’m not a big fan of separating out inserts and updates so I tend to have a single method that handles both. But if you want to follow REST suggestions, you can create a PUT method that handles updates by forwarding the PUT operation to the POST method:public HttpResponseMessage PutAlbum(Album album) { return PostAlbum(album); } To make the corresponding $.ajax() call, all you have to change from Listing 7 is the type: from POST to PUT. Model Binding with UrlEncoded POST Variables In the example in Listing 7 I used JSON objects to post a serialized object to a server method that accepted an strongly typed object with the same structure, which is a common way to send data to the server. However, Web API supports a number of different ways that data can be received by server methods. For example, another common way is to use plain UrlEncoded POST values to send to the server. Web API supports Model Binding that works similar (but not the same) as MVC's model binding where POST variables are mapped to properties of object parameters of the target method. This is actually quite common for AJAX calls that want to avoid serialization and the potential requirement of a JSON parser on older browsers. For example, using jQUery you might use the $.post() method to send a new album to the server (albeit one without songs) using code like the following:$.post("albums/",{AlbumName: "Dirty Deeds", YearReleased: 1976 … },albumPostCallback); Although the code looks very similar to the client code we used before passing JSON, here the data passed is URL encoded values (AlbumName=Dirty+Deeds&YearReleased=1976 etc.). Web API then takes this POST data and maps each of the POST values to the properties of the Album object in the method's parameter. Although the client code is different the server can both handle the JSON object, or the UrlEncoded POST values. Dynamic Access to POST Data There are also a few options available to dynamically access POST data, if you know what type of data you're dealing with. If you have POST UrlEncoded values, you can dynamically using a FormsDataCollection:[HttpPost] public string PostAlbum(FormDataCollection form) { return string.Format("{0} - released {1}", form.Get("AlbumName"),form.Get("RearReleased")); } The FormDataCollection is a very simple object, that essentially provides the same functionality as Request.Form[] in ASP.NET. Request.Form[] still works if you're running hosted in an ASP.NET application. However as a general rule, while ASP.NET's functionality is always available when running Web API hosted inside of an ASP.NET application, using the built in classes specific to Web API makes it possible to run Web API applications in a self hosted environment outside of ASP.NET. If your client is sending JSON to your server, and you don't want to map the JSON to a strongly typed object because you only want to retrieve a few simple values, you can also accept a JObject parameter in your API methods:[HttpPost] public string PostAlbum(JObject jsonData) { dynamic json = jsonData; JObject jalbum = json.Album; JObject juser = json.User; string token = json.UserToken; var album = jalbum.ToObject<Album>(); var user = juser.ToObject<User>(); return String.Format("{0} {1} {2}", album.AlbumName, user.Name, token); } There quite a few options available to you to receive data with Web API, which gives you more choices for the right tool for the job. Unfortunately one shortcoming of Web API is that POST data is always mapped to a single parameter. This means you can't pass multiple POST parameters to methods that receive POST data. It's possible to accept multiple parameters, but only one can map to the POST content - the others have to come from the query string or route values. I have a couple of Blog POSTs that explain what works and what doesn't here: Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API Handling Delete Operations Finally, to round out the server API code of the album example we've been discussin, here’s the DELETE verb controller method that allows removal of an album by its title:public HttpResponseMessage DeleteAlbum(string title) { var matchedAlbum = AlbumData.Current.Where(alb => alb.AlbumName == title) .SingleOrDefault(); if (matchedAlbum == null) return new HttpResponseMessage(HttpStatusCode.NotFound); AlbumData.Current.Remove(matchedAlbum); return new HttpResponseMessage(HttpStatusCode.NoContent); } To call this action method using jQuery, you can use:$(".removeimage").live("click", function () { var $el = $(this).parent(".album"); var txt = $el.find("a").text(); $.ajax({ url: "albums/" + encodeURIComponent(txt), type: "Delete", success: function (result) { $el.fadeOut().remove(); }, error: jqError }); } Note the use of the DELETE verb in the $.ajax() call, which routes to DeleteAlbum on the server. DELETE is a non-content operation, so you supply a resource ID (the title) via route value or the querystring. Routing Conflicts In all requests with the exception of the AlbumArt image example shown so far, I used HTTP Verb routing that I set up in Listing 1. HTTP Verb Routing is a recommendation that is in line with typical REST access to HTTP resources. However, it takes quite a bit of effort to create REST-compliant API implementations based only on HTTP Verb routing only. You saw one example that didn’t really fit – the return of an image where I created a custom route albums/{title}/image that required creation of a second controller and a custom route to work. HTTP Verb routing to a controller does not mix with custom or action routing to the same controller because of the limited mapping of HTTP verbs imposed by HTTP Verb routing. To understand some of the problems with verb routing, let’s look at another example. Let’s say you create a GetSortableAlbums() method like this and add it to the original AlbumApiController accessed via HTTP Verb routing:[HttpGet] public IQueryable<Album> SortableAlbums() { var albums = AlbumData.Current; // generally should be done only on actual queryable results (EF etc.) // Done here because we're running with a static list but otherwise might be slow return albums.AsQueryable(); } If you compile this code and try to now access the /albums/ link, you get an error: Multiple Actions were found that match the request. HTTP Verb routing only allows access to one GET operation per parameter/route value match. If more than one method exists with the same parameter signature, it doesn’t work. As I mentioned earlier for the image display, the only solution to get this method to work is to throw it into another controller. Because I already set up the AlbumRpcApiController I can add the method there. First, I should rename the method to SortableAlbums() so I’m not using a Get prefix for the method. This also makes the action parameter look cleaner in the URL - it looks less like a method and more like a noun. I can then create a new route that handles direct-action mapping:RouteTable.Routes.MapHttpRoute( name: "AlbumRpcApiAction", routeTemplate: "albums/rpc/{action}/{title}", defaults: new { title = RouteParameter.Optional, controller = "AlbumRpcApi", action = "GetAblums" } ); As I am explicitly adding a route segment – rpc – into the route template, I can now reference explicit methods in the Web API controller using URLs like this: http://localhost/AspNetWebApi/rpc/SortableAlbums Error Handling I’ve already done some minimal error handling in the examples. For example in Listing 6, I detected some known-error scenarios like model validation failing or a resource not being found and returning an appropriate HttpResponseMessage result. But what happens if your code just blows up or causes an exception? If you have a controller method, like this:[HttpGet] public void ThrowException() { throw new UnauthorizedAccessException("Unauthorized Access Sucka"); } You can call it with this: http://localhost/AspNetWebApi/albums/rpc/ThrowException The default exception handling displays a 500-status response with the serialized exception on the local computer only. When you connect from a remote computer, Web API throws back a 500 HTTP Error with no data returned (IIS then adds its HTML error page). The behavior is configurable in the GlobalConfiguration:GlobalConfiguration .Configuration .IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Never; If you want more control over your error responses sent from code, you can throw explicit error responses yourself using HttpResponseException. When you throw an HttpResponseException the response parameter is used to generate the output for the Controller action. [HttpGet] public void ThrowError() { var resp = Request.CreateResponse<ApiMessageError>( HttpStatusCode.BadRequest, new ApiMessageError("Your code stinks!")); throw new HttpResponseException(resp); } Throwing an HttpResponseException stops the processing of the controller method and immediately returns the response you passed to the exception. Unlike other Exceptions fired inside of WebAPI, HttpResponseException bypasses the Exception Filters installed and instead just outputs the response you provide. In this case, the serialized ApiMessageError result string is returned in the default serialization format – XML or JSON. You can pass any content to HttpResponseMessage, which includes creating your own exception objects and consistently returning error messages to the client. Here’s a small helper method on the controller that you might use to send exception info back to the client consistently:private void ThrowSafeException(string message, HttpStatusCode statusCode = HttpStatusCode.BadRequest) { var errResponse = Request.CreateResponse<ApiMessageError>(statusCode, new ApiMessageError() { message = message }); throw new HttpResponseException(errResponse); } You can then use it to output any captured errors from code:[HttpGet] public void ThrowErrorSafe() { try { List<string> list = null; list.Add("Rick"); } catch (Exception ex) { ThrowSafeException(ex.Message); } } Exception Filters Another more global solution is to create an Exception Filter. Filters in Web API provide the ability to pre- and post-process controller method operations. An exception filter looks at all exceptions fired and then optionally creates an HttpResponseMessage result. Listing 8 shows an example of a basic Exception filter implementation.public class UnhandledExceptionFilter : ExceptionFilterAttribute { public override void OnException(HttpActionExecutedContext context) { HttpStatusCode status = HttpStatusCode.InternalServerError; var exType = context.Exception.GetType(); if (exType == typeof(UnauthorizedAccessException)) status = HttpStatusCode.Unauthorized; else if (exType == typeof(ArgumentException)) status = HttpStatusCode.NotFound; var apiError = new ApiMessageError() { message = context.Exception.Message }; // create a new response and attach our ApiError object // which now gets returned on ANY exception result var errorResponse = context.Request.CreateResponse<ApiMessageError>(status, apiError); context.Response = errorResponse; base.OnException(context); } } Exception Filter Attributes can be assigned to an ApiController class like this:[UnhandledExceptionFilter] public class AlbumRpcApiController : ApiController or you can globally assign it to all controllers by adding it to the HTTP Configuration's Filters collection:GlobalConfiguration.Configuration.Filters.Add(new UnhandledExceptionFilter()); The latter is a great way to get global error trapping so that all errors (short of hard IIS errors and explicit HttpResponseException errors) return a valid error response that includes error information in the form of a known-error object. Using a filter like this allows you to throw an exception as you normally would and have your filter create a response in the appropriate output format that the client expects. For example, an AJAX application can on failure expect to see a JSON error result that corresponds to the real error that occurred rather than a 500 error along with HTML error page that IIS throws up. You can even create some custom exceptions so you can differentiate your own exceptions from unhandled system exceptions - you often don't want to display error information from 'unknown' exceptions as they may contain sensitive system information or info that's not generally useful to users of your application/site. This is just one example of how ASP.NET Web API is configurable and extensible. Exception filters are just one example of how you can plug-in into the Web API request flow to modify output. Many more hooks exist and I’ll take a closer look at extensibility in Part 2 of this article in the future. Summary Web API is a big improvement over previous Microsoft REST and AJAX toolkits. The key features to its usefulness are its ease of use with simple controller based logic, familiar MVC-style routing, low configuration impact, extensibility at all levels and tight attention to exposing and making HTTP semantics easily discoverable and easy to use. Although none of the concepts used in Web API are new or radical, Web API combines the best of previous platforms into a single framework that’s highly functional, easy to work with, and extensible to boot. I think that Microsoft has hit a home run with Web API. Related Resources Where does ASP.NET Web API fit? Sample Source Code on GitHub Passing multiple POST parameters to Web API Controller Methods Mapping UrlEncoded POST Values in ASP.NET Web API Creating a JSONP Formatter for ASP.NET Web API Removing the XML Formatter from ASP.NET Web API Applications© Rick Strahl, West Wind Technologies, 2005-2012Posted in Web Api Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article

Analytic functions – they’re not aggregates

- by Rob Farley

SQL 2012 brings us a bunch of new analytic functions, together with enhancements to the OVER clause. People who have known me over the years will remember that I’m a big fan of the OVER clause and the types of things that it brings us when applied to aggregate functions, as well as the ranking functions that it enables. The OVER clause was introduced in SQL Server 2005, and remained frustratingly unchanged until SQL Server 2012. This post is going to look at a particular aspect of the analytic functions though (not the enhancements to the OVER clause). When I give presentations about the analytic functions around Australia as part of the tour of SQL Saturdays (starting in Brisbane this Thursday), and in Chicago next month, I’ll make sure it’s sufficiently well described. But for this post – I’m going to skip that and assume you get it. The analytic functions introduced in SQL 2012 seem to come in pairs – FIRST_VALUE and LAST_VALUE, LAG and LEAD, CUME_DIST and PERCENT_RANK, PERCENTILE_CONT and PERCENTILE_DISC. Perhaps frustratingly, they take slightly different forms as well. The ones I want to look at now are FIRST_VALUE and LAST_VALUE, and PERCENTILE_CONT and PERCENTILE_DISC. The reason I’m pulling this ones out is that they always produce the same result within their partitions (if you’re applying them to the whole partition). Consider the following query: SELECT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), LAST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)), PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)) FROM Sales.SalesOrderHeader ; This is designed to get the TotalDue for the first order of the year, the last order of the year, and also the 95% percentile, using both the continuous and discrete methods (‘discrete’ means it picks the closest one from the values available – ‘continuous’ means it will happily use something between, similar to what you would do for a traditional median of four values). I’m sure you can imagine the results – a different value for each field, but within each year, all the rows the same. Notice that I’m not grouping by the year. Nor am I filtering. This query gives us a result for every row in the SalesOrderHeader table – 31465 in this case (using the original AdventureWorks that dates back to the SQL 2005 days). The RANGE BETWEEN bit in FIRST_VALUE and LAST_VALUE is needed to make sure that we’re considering all the rows available. If we don’t specify that, it assumes we only mean “RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW”, which means that LAST_VALUE ends up being the row we’re looking at. At this point you might think about other environments such as Access or Reporting Services, and remember aggregate functions like FIRST. We really should be able to do something like: SELECT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM Sales.SalesOrderHeader GROUP BY YEAR(OrderDate) ; But you can’t. You get that age-old error: Msg 8120, Level 16, State 1, Line 5 Column 'Sales.SalesOrderHeader.OrderDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Msg 8120, Level 16, State 1, Line 5 Column 'Sales.SalesOrderHeader.SalesOrderID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Hmm. You see, FIRST_VALUE isn’t an aggregate function. None of these analytic functions are. There are too many things involved for SQL to realise that the values produced might be identical within the group. Furthermore, you can’t even surround it in a MAX. Then you get a different error, telling you that you can’t use windowed functions in the context of an aggregate. And so we end up grouping by doing a DISTINCT. SELECT DISTINCT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), LAST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)), PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)) FROM Sales.SalesOrderHeader ; I’m sorry. It’s just the way it goes. Hopefully it’ll change the future, but for now, it’s what you’ll have to do. If we look in the execution plan, we see that it’s incredibly ugly, and actually works out the results of these analytic functions for all 31465 rows, finally performing the distinct operation to convert it into the four rows we get in the results. You might be able to achieve a better plan using things like TOP, or the kind of calculation that I used in http://sqlblog.com/blogs/rob_farley/archive/2011/08/23/t-sql-thoughts-about-the-95th-percentile.aspx (which is how PERCENTILE_CONT works), but it’s definitely convenient to use these functions, and in time, I’m sure we’ll see good improvements in the way that they are implemented. Oh, and this post should be good for fellow SQL Server MVP Nigel Sammy’s T-SQL Tuesday this month.

Read the article

Using a "white list" for extracting terms for Text Mining

- by [email protected]

In Part 1 of my post on "Generating cluster names from a document clustering model" (part 1, part 2, part 3), I showed how to build a clustering model from text documents using Oracle Data Miner, which automates preparing data for text mining. In this process we specified a custom stoplist and lexer and relied on Oracle Text to identify important terms. However, there is an alternative approach, the white list, which uses a thesaurus object with the Oracle Text CTXRULE index to allow you to specify the important terms. INTRODUCTIONA stoplist is used to exclude, i.e., black list, specific words in your documents from being indexed. For example, words like a, if, and, or, and but normally add no value when text mining. Other words can also be excluded if they do not help to differentiate documents, e.g., the word Oracle is ubiquitous in the Oracle product literature. One problem with stoplists is determining which words to specify. This usually requires inspecting the terms that are extracted, manually identifying which ones you don't want, and then re-indexing the documents to determine if you missed any. Since a corpus of documents could contain thousands of words, this could be a tedious exercise. Moreover, since every word is considered as an individual token, a term excluded in one context may be needed to help identify a term in another context. For example, in our Oracle product literature example, the words "Oracle Data Mining" taken individually are not particular helpful. The term "Oracle" may be found in nearly all documents, as with the term "Data." The term "Mining" is more unique, but could also refer to the Mining industry. If we exclude "Oracle" and "Data" by specifying them in the stoplist, we lose valuable information. But it we include them, they may introduce too much noise. Still, when you have a broad vocabulary or don't have a list of specific terms of interest, you rely on the text engine to identify important terms, often by computing the term frequency - inverse document frequency metric. (This is effectively a weight associated with each term indicating its relative importance in a document within a collection of documents. We'll revisit this later.) The results using this technique is often quite valuable. As noted above, an alternative to the subtractive nature of the stoplist is to specify a white list, or a list of terms--perhaps multi-word--that we want to extract and use for data mining. The obvious downside to this approach is the need to specify the set of terms of interest. However, this may not be as daunting a task as it seems. For example, in a given domain (Oracle product literature), there is often a recognized glossary, or a list of keywords and phrases (Oracle product names, industry names, product categories, etc.). Being able to identify multi-word terms, e.g., "Oracle Data Mining" or "Customer Relationship Management" as a single token can greatly increase the quality of the data mining results. The remainder of this post and subsequent posts will focus on how to produce a dataset that contains white list terms, suitable for mining. CREATING A WHITE LIST We'll leverage the thesaurus capability of Oracle Text. Using a thesaurus, we create a set of rules that are in effect our mapping from single and multi-word terms to the tokens used to represent those terms. For example, "Oracle Data Mining" becomes "ORACLEDATAMINING." First, we'll create and populate a mapping table called my_term_token_map. All text has been converted to upper case and values in the TERM column are intended to be mapped to the token in the TOKEN column. TERM TOKEN DATA MINING DATAMINING ORACLE DATA MINING ORACLEDATAMINING 11G ORACLE11G JAVA JAVA CRM CRM CUSTOMER RELATIONSHIP MANAGEMENT CRM ... Next, we'll create a thesaurus object my_thesaurus and a rules table my_thesaurus_rules: CTX_THES.CREATE_THESAURUS('my_thesaurus', FALSE); CREATE TABLE my_thesaurus_rules (main_term VARCHAR2(100), query_string VARCHAR2(400)); We next populate the thesaurus object and rules table using the term token map. A cursor is defined over my_term_token_map. As we iterate over the rows, we insert a synonym relationship 'SYN' into the thesaurus. We also insert into the table my_thesaurus_rules the main term, and the corresponding query string, which specifies synonyms for the token in the thesaurus. DECLARE cursor c2 is select token, term from my_term_token_map; BEGIN for r_c2 in c2 loop CTX_THES.CREATE_RELATION('my_thesaurus',r_c2.token,'SYN',r_c2.term); EXECUTE IMMEDIATE 'insert into my_thesaurus_rules values (:1,''SYN(' || r_c2.token || ', my_thesaurus)'')' using r_c2.token; end loop; END; We are effectively inserting the token to return and the corresponding query that will look up synonyms in our thesaurus into the my_thesaurus_rules table, for example: 'ORACLEDATAMINING' SYN ('ORACLEDATAMINING', my_thesaurus)At this point, we create a CTXRULE index on the my_thesaurus_rules table: create index my_thesaurus_rules_idx on my_thesaurus_rules(query_string) indextype is ctxsys.ctxrule; In my next post, this index will be used to extract the tokens that match each of the rules specified. We'll then compute the tf-idf weights for each of the terms and create a nested table suitable for mining.

Read the article

Listing common SQL Code Smells.

- by Phil Factor

Once you’ve done a number of SQL Code-reviews, you’ll know those signs in the code that all might not be well. These ’Code Smells’ are coding styles that don’t directly cause a bug, but are indicators that all is not well with the code. . Kent Beck and Massimo Arnoldi seem to have coined the phrase in the "OnceAndOnlyOnce" page of www.C2.com, where Kent also said that code "wants to be simple". Bad Smells in Code was an essay by Kent Beck and Martin Fowler, published as Chapter 3 of the book ‘Refactoring: Improving the Design of Existing Code’ (ISBN 978-0201485677) Although there are generic code-smells, SQL has its own particular coding habits that will alert the programmer to the need to re-factor what has been written. See Exploring Smelly Code and Code Deodorants for Code Smells by Nick Harrison for a grounding in Code Smells in C# I’ve always been tempted by the idea of automating a preliminary code-review for SQL. It would be so useful to trawl through code and pick up the various problems, much like the classic ‘Lint’ did for C, and how the Code Metrics plug-in for .NET Reflector by Jonathan 'Peli' de Halleux is used for finding Code Smells in .NET code. The problem is that few of the standard procedural code smells are relevant to SQL, and we need an agreed list of code smells. Merrilll Aldrich made a grand start last year in his blog Top 10 T-SQL Code Smells.However, I'd like to make a start by discovering if there is a general opinion amongst Database developers what the most important SQL Smells are. One can be a bit defensive about code smells. I will cheerfully write very long stored procedures, even though they are frowned on. I’ll use dynamic SQL occasionally. You can only use them as an aid for your own judgment and it is fine to ‘sign them off’ as being appropriate in particular circumstances. Also, whole classes of ‘code smells’ may be irrelevant for a particular database. The use of proprietary SQL, for example, is only a ‘code smell’ if there is a chance that the database will have to be ported to another RDBMS. The use of dynamic SQL is a risk only with certain security models. As the saying goes, a CodeSmell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist. Plamen Ratchev’s wonderful article Ten Common SQL Programming Mistakes lists some of these ‘code smells’ along with out-and-out mistakes, but there are more. The use of nested transactions, for example, isn’t entirely incorrect, even though the database engine ignores all but the outermost: but it does flag up the possibility that the programmer thinks that nested transactions are supported. If anything requires some sort of general agreement, the definition of code smells is one. I’m therefore going to make this Blog ‘dynamic, in that, if anyone twitters a suggestion with a #SQLCodeSmells tag (or sends me a twitter) I’ll update the list here. If you add a comment to the blog with a suggestion of what should be added or removed, I’ll do my best to oblige. In other words, I’ll try to keep this blog up to date. The name against each 'smell' is the name of the person who Twittered me, commented about or who has written about the 'smell'. it does not imply that they were the first ever to think of the smell! Use of deprecated syntax such as *= (Dave Howard) Denormalisation that requires the shredding of the contents of columns. (Merrill Aldrich) Contrived interfaces Use of deprecated datatypes such as TEXT/NTEXT (Dave Howard) Datatype mis-matches in predicates that rely on implicit conversion.(Plamen Ratchev) Using Correlated subqueries instead of a join (Dave_Levy/ Plamen Ratchev) The use of Hints in queries, especially NOLOCK (Dave Howard /Mike Reigler) Few or No comments. Use of functions in a WHERE clause. (Anil Das) Overuse of scalar UDFs (Dave Howard, Plamen Ratchev) Excessive ‘overloading’ of routines. The use of Exec xp_cmdShell (Merrill Aldrich) Excessive use of brackets. (Dave Levy) Lack of the use of a semicolon to terminate statements Use of non-SARGable functions on indexed columns in predicates (Plamen Ratchev) Duplicated code, or strikingly similar code. Misuse of SELECT * (Plamen Ratchev) Overuse of Cursors (Everyone. Special mention to Dave Levy & Adrian Hills) Overuse of CLR routines when not necessary (Sam Stange) Same column name in different tables with different datatypes. (Ian Stirk) Use of ‘broken’ functions such as ‘ISNUMERIC’ without additional checks. Excessive use of the WHILE loop (Merrill Aldrich) INSERT ... EXEC (Merrill Aldrich) The use of stored procedures where a view is sufficient (Merrill Aldrich) Not using two-part object names (Merrill Aldrich) Using INSERT INTO without specifying the columns and their order (Merrill Aldrich) Full outer joins even when they are not needed. (Plamen Ratchev) Huge stored procedures (hundreds/thousands of lines). Stored procedures that can produce different columns, or order of columns in their results, depending on the inputs. Code that is never used. Complex and nested conditionals WHILE (not done) loops without an error exit. Variable name same as the Datatype Vague identifiers. Storing complex data or list in a character map, bitmap or XML field User procedures with sp_ prefix (Aaron Bertrand)Views that reference views that reference views that reference views (Aaron Bertrand) Inappropriate use of sql_variant (Neil Hambly) Errors with identity scope using SCOPE_IDENTITY @@IDENTITY or IDENT_CURRENT (Neil Hambly, Aaron Bertrand) Schemas that involve multiple dated copies of the same table instead of partitions (Matt Whitfield-Atlantis UK) Scalar UDFs that do data lookups (poor man's join) (Matt Whitfield-Atlantis UK) Code that allows SQL Injection (Mladen Prajdic) Tables without clustered indexes (Matt Whitfield-Atlantis UK) Use of "SELECT DISTINCT" to mask a join problem (Nick Harrison) Multiple stored procedures with nearly identical implementation. (Nick Harrison) Excessive column aliasing may point to a problem or it could be a mapping implementation. (Nick Harrison) Joining "too many" tables in a query. (Nick Harrison) Stored procedure returning more than one record set. (Nick Harrison) A NOT LIKE condition (Nick Harrison) excessive "OR" conditions. (Nick Harrison) User procedures with sp_ prefix (Aaron Bertrand) Views that reference views that reference views that reference views (Aaron Bertrand) sp_OACreate or anything related to it (Bill Fellows) Prefixing names with tbl_, vw_, fn_, and usp_ ('tibbling') (Jeremiah Peschka) Aliases that go a,b,c,d,e... (Dave Levy/Diane McNurlan) Overweight Queries (e.g. 4 inner joins, 8 left joins, 4 derived tables, 10 subqueries, 8 clustered GUIDs, 2 UDFs, 6 case statements = 1 query) (Robert L Davis) Order by 3,2 (Dave Levy) MultiStatement Table functions which are then filtered 'Sel * from Udf() where Udf.Col = Something' (Dave Ballantyne) running a SQL 2008 system in SQL 2000 compatibility mode(John Stafford)

Read the article

Is Berkeley DB a NoSQL solution?

- by Gregory Burd

Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

Read the article

value types in the vm

- by john.rose

value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases. Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both. We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types. These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im, /*Complex y:*/ double y_re, double y_im) { /*Complex z = x.minus(y):*/ double z_re = x_re - y_re, z_im = x_im - y_im; /*return z.abs():*/ return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable { // immutable component structure: public final double re, im; private Complex(double re, double im) { this.re = re; this.im = im; } // interoperability methods: public String toString() { return "Complex("+re+","+im+")"; } public List<Double> asList() { return Arrays.asList(re, im); } public boolean equals(Complex c) { return re == c.re && im == c.im; } public boolean equals(@ValueSafe Object x) { return x instanceof Complex && equals((Complex) x); } public int hashCode() { return 31*Double.valueOf(re).hashCode() + Double.valueOf(im).hashCode(); } // factory methods: public static Complex valueOf(double re, double im) { return new Complex(re, im); } public Complex changeRe(double re2) { return valueOf(re2, im); } public Complex changeIm(double im2) { return valueOf(re, im2); } public static Complex cast(@ValueSafe Object x) { return x == null ? ZERO : (Complex) x; } // utility methods and constants: public Complex plus(Complex c) { return new Complex(re+c.re, im+c.im); } public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); } public double abs() { return Math.sqrt(re*re + im*im); } public static final Complex PI = valueOf(Math.PI, 0.0); public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts. The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final. (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations. A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations. No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type. If a value-safe class is marked final, it is in fact a value type. All other value-safe classes must be abstract. The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex. Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable { // All methods listed here must get redefined. // Definitions must be value-safe, which means // they may depend on component values only. List<? extends Object> asList(); int hashCode(); boolean equals(@ValueSafe Object c); String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system. It could appear as an element type or parameter bound, for facilities which are designed to work on value types only. More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods. (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.) Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation. In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation. Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0); //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe") Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object(); // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi); // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi); // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi); // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero); // error: reference comparison on value type boolean z2 = (pi == null); // error: reference comparison on value type boolean z3 = (pi == obj2); // error: reference comparison on value type synchronized (pi) { } // error: synch of value, unpredictable result synchronized (obj2) { } // unpredictable result Complex qq = pi; qq = null; // possible NPE; warning: “null-unsafe" qq = (Complex) obj; // warning: “null-unsafe" qq = Complex.cast(obj); // OK @SuppressWarnings("null-unsafe") Complex empty = null; // possible NPE qq = empty; // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations. Fields and variables of value types can be split into their unboxed components. Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules? Why not detect programs automagically and perform unboxing transparently? The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced. Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal. In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur. This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer. Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules. There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme. Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions. One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type. That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances. Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe. Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around. Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed. This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods. This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors. The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type. Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands. When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not. If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible. Proxies must be managed carefully, and this can be expensive. On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces. (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.) Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems. They require the deepest transformations, relative to today’s JVM. There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics. It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form. Such value-safe arrays would not be convertible to Object[] arrays. Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex. I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code. On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly. Likewise, there are rules for providing default access modifiers for interface members. Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types. Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType { // implicitly final public double re, im; // implicitly public final //implicit methods are defined elementwise from te fields: // toString, asList, equals(2), hashCode, valueOf, cast //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner. The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer. A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point. The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage? Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number. Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”). What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple { double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly. The supertype is an abstract class which has suitable shared declarations. The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it. The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types. (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language. (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.) At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists. Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection. That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types. Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API. The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components that appear as parameters, return types, and array elements. This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations. A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile. Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type. It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack. If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction? Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot. But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes. I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value. Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType { protected final long header; protected private final BigInteger digits; public DecimalValue valueOf(int value, int scale) { assert(scale >= 0); return new DecimalValue(((long)value << 32) + scale, null); } public DecimalValue valueOf(long value, int scale) { if (value == (int) value) return valueOf((int)value, scale); return new DecimalValue(-scale, new BigInteger(value)); } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case. It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues. (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.) Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits. For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types. This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant. An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes. More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable, Unsynchronizable, Interchangeable { … public class Complex implements ValueType { // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types. This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable. It is likely that code which violates value-safety for wrapper types exists but is uncommon. It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code. The design presented above obscures that distinction. As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations. Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed. For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi; // convert to boxed myList.add(boxPi); complex z = myList.get(0); // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM. Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type. Something like this which can also accommodate value type components seems worthwhile. On the other hand, does it make sense for value types to contain short arrays? And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure. These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types. In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids. No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now. Back to work!

Read the article

Using Table-Valued Parameters in SQL Server

- by Jesse

I work with stored procedures in SQL Server pretty frequently and have often found myself with a need to pass in a list of values at run-time. Quite often this list contains a set of ids on which the stored procedure needs to operate the size and contents of which are not known at design time. In the past I’ve taken the collection of ids (which are usually integers), converted them to a string representation where each value is separated by a comma and passed that string into a VARCHAR parameter of a stored procedure. The body of the stored procedure would then need to parse that string into a table variable which could be easily consumed with set-based logic within the rest of the stored procedure. This approach works pretty well but the VARCHAR variable has always felt like an un-wanted “middle man” in this scenario. Of course, I could use a BULK INSERT operation to load the list of ids into a temporary table that the stored procedure could use, but that approach seems heavy-handed in situations where the list of values is usually going to contain only a few dozen values. Fortunately SQL Server 2008 introduced the concept of table-valued parameters which effectively eliminates the need for the clumsy middle man VARCHAR parameter. Example: Customer Transaction Summary Report Let’s say we have a report that can summarize the the transactions that we’ve conducted with customers over a period of time. The report returns a pretty simple dataset containing one row per customer with some key metrics about how much business that customer has conducted over the date range for which the report is being run. Sometimes the report is run for a single customer, sometimes it’s run for all customers, and sometimes it’s run for a handful of customers (i.e. a salesman runs it for the customers that fall into his sales territory). This report can be invoked from a website on-demand, or it can be scheduled for periodic delivery to certain users via SQL Server Reporting Services. Because the report can be created from different places and the query to generate the report is complex it’s been packed into a stored procedure that accepts three parameters: @startDate – The beginning of the date range for which the report should be run. @endDate – The end of the date range for which the report should be run. @customerIds – The customer Ids for which the report should be run. Obviously, the @startDate and @endDate parameters are DATETIME variables. The @customerIds parameter, however, needs to contain a list of the identity values (primary key) from the Customers table representing the customers that were selected for this particular run of the report. In prior versions of SQL Server we might have made this parameter a VARCHAR variable, but with SQL Server 2008 we can make it into a table-valued parameter. Defining And Using The Table Type In order to use a table-valued parameter, we first need to tell SQL Server about what the table will look like. We do this by creating a user defined type. For the purposes of this stored procedure we need a very simple type to model a table variable with a single integer column. We can create a generic type called ‘IntegerListTableType’ like this: CREATE TYPE IntegerListTableType AS TABLE (Value INT NOT NULL) Once defined, we can use this new type to define the @customerIds parameter in the signature of our stored procedure. The parameter list for the stored procedure definition might look like: 1: CREATE PROCEDURE dbo.rpt_CustomerTransactionSummary 2: @starDate datetime, 3: @endDate datetime, 4: @customerIds IntegerListTableTableType READONLY Note the ‘READONLY’ statement following the declaration of the @customerIds parameter. SQL Server requires any table-valued parameter be marked as ‘READONLY’ and no DML (INSERT/UPDATE/DELETE) statements can be performed on a table-valued parameter within the routine in which it’s used. Aside from the DML restriction, however, you can do pretty much anything with a table-valued parameter as you could with a normal TABLE variable. With the user defined type and stored procedure defined as above, we could invoke like this: 1: DECLARE @cusomterIdList IntegerListTableType 2: INSERT @customerIdList VALUES (1) 3: INSERT @customerIdList VALUES (2) 4: INSERT @customerIdList VALUES (3) 5: 6: EXEC dbo.rpt_CustomerTransationSummary 7: @startDate = '2012-05-01', 8: @endDate = '2012-06-01' 9: @customerIds = @customerIdList Note that we can simply declare a variable of type ‘IntegerListTableType’ just like any other normal variable and insert values into it just like a TABLE variable. We could also populate the variable with a SELECT … INTO or INSERT … SELECT statement if desired. Using The Table-Valued Parameter With ADO .NET Invoking a stored procedure with a table-valued parameter from ADO .NET is as simple as building a DataTable and passing it in as the Value of a SqlParameter. Here’s some example code for how we would construct the SqlParameter for the @customerIds parameter in our stored procedure: 1: var customerIdsParameter = new SqlParameter(); 2: customerIdParameter.Direction = ParameterDirection.Input; 3: customerIdParameter.TypeName = "IntegerListTableType"; 4: customerIdParameter.Value = selectedCustomerIds.ToIntegerListDataTable("Value"); All we’re doing here is new’ing up an instance of SqlParameter, setting the pamameters direction, specifying the name of the User Defined Type that this parameter uses, and setting its value. We’re assuming here that we have an IEnumerable<int> variable called ‘selectedCustomerIds’ containing all of the customer Ids for which the report should be run. The ‘ToIntegerListDataTable’ method is an extension method of the IEnumerable<int> type that looks like this: 1: public static DataTable ToIntegerListDataTable(this IEnumerable<int> intValues, string columnName) 2: { 3: var intergerListDataTable = new DataTable(); 4: intergerListDataTable.Columns.Add(columnName); 5: foreach(var intValue in intValues) 6: { 7: var nextRow = intergerListDataTable.NewRow(); 8: nextRow[columnName] = intValue; 9: intergerListDataTable.Rows.Add(nextRow); 10: } 11: 12: return intergerListDataTable; 13: } Since the ‘IntegerListTableType’ has a single int column called ‘Value’, we pass that in for the ‘columnName’ parameter to the extension method. The method creates a new single-columned DataTable using the provided column name then iterates over the items in the IEnumerable<int> instance adding one row for each value. We can then use this SqlParameter instance when invoking the stored procedure just like we would use any other parameter. Advanced Functionality Using passing a list of integers into a stored procedure is a very simple usage scenario for the table-valued parameters feature, but I’ve found that it covers the majority of situations where I’ve needed to pass a collection of data for use in a query at run-time. I should note that BULK INSERT feature still makes sense for passing large amounts of data to SQL Server for processing. MSDN seems to suggest that 1000 rows of data is the tipping point where the overhead of a BULK INSERT operation can pay dividends. I should also note here that table-valued parameters can be used to deal with more complex data structures than single-columned tables of integers. A User Defined Type that backs a table-valued parameter can use things like identities and computed columns. That said, using some of these more advanced features might require the use the SqlDataRecord and SqlMetaData classes instead of a simple DataTable. Erland Sommarskog has a great article on his website that describes when and how to use these classes for table-valued parameters. What About Reporting Services? Earlier in the post I referenced the fact that our example stored procedure would be called from both a web application and a SQL Server Reporting Services report. Unfortunately, using table-valued parameters from SSRS reports can be a bit tricky and warrants its own blog post which I’ll be putting together and posting sometime in the near future.

Read the article

MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1

- by Mat Keep

0 0 1 893 5092 Homework 42 11 5974 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} Summary The scalability enhancements delivered by extensions to multi-threaded data nodes enables MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark What’s New in MySQL Cluster 7.2 MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use Once of the key enhancements delivered in MySQL Cluster 7.2 is extensions made to the multi-threading processes of the data nodes. Multi-Threaded Data Node Extensions The MySQL Cluster 7.2 data node is now functionally divided into seven thread types: 1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads. 2) Transaction Coordinator threads (tc) 3) Asynchronous Replication threads (rep) 4) Schema Management threads (main) 5) Network receiver threads (recv) 6) Network send threads (send) 7) IO threads Each of these thread types are discussed in more detail below. MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible, however, to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads. The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes. Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release. To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread. The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase of threads in MySQL Cluster 7.2 it is also necessary to increase the number of recv threads to 8. This enables each receive thread to service one or more sockets used to communicate with other nodes the Cluster. The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously other threads handled the sending operations themselves, which can provide for lower latency. To achieve highest throughput however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 to a legacy mode that does not use any of the send threads – useful for those workloads that are most sensitive to latency. The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load. Benchmarking the Scalability Enhancements The scalability enhancements discussed above have made it possible to scale CPU usage of each data node to more than 5x of that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x. Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1 The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon x5670-based cluster of dual socket commodity servers (6 cores each). As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data nodes than MySQL Cluster 7.1. More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned! In a following blog post, I’ll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster Conclusion MySQL Cluster has achieved a range of impressive benchmark results, and set in context with the previous 7.1 release, is able to deliver over 8x higher performance per node. As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Read the article

The Shift: how Orchard painlessly shifted to document storage, and how it’ll affect you

- by Bertrand Le Roy

We’ve known it all along. The storage for Orchard content items would be much more efficient using a document database than a relational one. Orchard content items are composed of parts that serialize naturally into infoset kinds of documents. Storing them as relational data like we’ve done so far was unnatural and requires the data for a single item to span multiple tables, related through 1-1 relationships. This means lots of joins in queries, and a great potential for Select N+1 problems. Document databases, unfortunately, are still a tough sell in many places that prefer the more familiar relational model. Being able to x-copy Orchard to hosters has also been a basic constraint in the design of Orchard. Combine those with the necessity at the time to run in medium trust, and with license compatibility issues, and you’ll find yourself with very few reasonable choices. So we went, a little reluctantly, for relational SQL stores, with the dream of one day transitioning to document storage. We have played for a while with the idea of building our own document storage on top of SQL databases, and Sébastien implemented something more than decent along those lines, but we had a better way all along that we didn’t notice until recently… In Orchard, there are fields, which are named properties that you can add dynamically to a content part. Because they are so dynamic, we have been storing them as XML into a column on the main content item table. This infoset storage and its associated API are fairly generic, but were only used for fields. The breakthrough was when Sébastien realized how this existing storage could give us the advantages of document storage with minimal changes, while continuing to use relational databases as the substrate. public bool CommercialPrices { get { return this.Retrieve(p => p.CommercialPrices); } set { this.Store(p => p.CommercialPrices, value); } } This code is very compact and efficient because the API can infer from the expression what the type and name of the property are. It is then able to do the proper conversions for you. For this code to work in a content part, there is no need for a record at all. This is particularly nice for site settings: one query on one table and you get everything you need. This shows how the existing infoset solves the data storage problem, but you still need to query. Well, for those properties that need to be filtered and sorted on, you can still use the current record-based relational system. This of course continues to work. We do however provide APIs that make it trivial to store into both record properties and the infoset storage in one operation: public double Price { get { return Retrieve(r => r.Price); } set { Store(r => r.Price, value); } } This code looks strikingly similar to the non-record case above. The difference is that it will manage both the infoset and the record-based storages. The call to the Store method will send the data in both places, keeping them in sync. The call to the Retrieve method does something even cooler: if the property you’re looking for exists in the infoset, it will return it, but if it doesn’t, it will automatically look into the record for it. And if that wasn’t cool enough, it will take that value from the record and store it into the infoset for the next time it’s required. This means that your data will start automagically migrating to infoset storage just by virtue of using the code above instead of the usual: public double Price { get { return Record.Price; } set { Record.Price = value; } } As your users browse the site, it will get faster and faster as Select N+1 issues will optimize themselves away. If you preferred, you could still have explicit migration code, but it really shouldn’t be necessary most of the time. If you do already have code using QueryHints to mitigate Select N+1 issues, you might want to reconsider those, as with the new system, you’ll want to avoid joins that you don’t need for filtering or sorting, further optimizing your queries. There are some rare cases where the storage of the property must be handled differently. Check out this string[] property on SearchSettingsPart for example: public string[] SearchedFields { get { return (Retrieve<string>("SearchedFields") ?? "") .Split(new[] {',', ' '}, StringSplitOptions.RemoveEmptyEntries); } set { Store("SearchedFields", String.Join(", ", value)); } } The array of strings is transformed by the property accessors into and from a comma-separated list stored in a string. The Retrieve and Store overloads used in this case are lower-level versions that explicitly specify the type and name of the attribute to retrieve or store. You may be wondering what this means for code or operations that look directly at the database tables instead of going through the new infoset APIs. Even if there is a record, the infoset version of the property will win if it exists, so it is necessary to keep the infoset up-to-date. It’s not very complicated, but definitely something to keep in mind. Here is what a product record looks like in Nwazet.Commerce for example: And here is the same data in the infoset: The infoset is stored in Orchard_Framework_ContentItemRecord or Orchard_Framework_ContentItemVersionRecord, depending on whether the content type is versionable or not. A good way to find what you’re looking for is to inspect the record table first, as it’s usually easier to read, and then get the item record of the same id. Here is the detailed XML document for this product: <Data> <ProductPart Inventory="40" Price="18" Sku="pi-camera-box" OutOfStockMessage="" AllowBackOrder="false" Weight="0.2" Size="" ShippingCost="null" IsDigital="false" /> <ProductAttributesPart Attributes="" /> <AutoroutePart DisplayAlias="camera-box" /> <TitlePart Title="Nwazet Pi Camera Box" /> <BodyPart Text="[...]" /> <CommonPart CreatedUtc="2013-09-10T00:39:00Z" PublishedUtc="2013-09-14T01:07:47Z" /> </Data> The data is neatly organized under each part. It is easy to see how that document is all you need to know about that content item, all in one table. If you want to modify that data directly in the database, you should be careful to do it in both the record table and the infoset in the content item record. In this configuration, the record is now nothing more than an index, and will only be used for sorting and filtering. Of course, it’s perfectly fine to mix record-backed properties and record-less properties on the same part. It really depends what you think must be sorted and filtered on. In turn, this potentially simplifies migrations considerably. So here it is, the great shift of Orchard to document storage, something that Orchard has been designed for all along, and that we were able to implement with a satisfying and surprising economy of resources. Expect this code to make its way into the 1.8 version of Orchard when that’s available.

Read the article

Service Broker, not ETL

- by jamiet

I have been very quiet on this blog of late and one reason for that is I have been very busy on a client project that I would like to talk about a little here. The client that I have been working for has a website that runs on a distributed architecture utilising a messaging infrastructure for communication between different endpoints. My brief was to build a system that could consume these messages and produce analytical information in near-real-time. More specifically I basically had to deliver a data warehouse however it was the real-time aspect of the project that really intrigued me. This real-time requirement meant that using an Extract transformation, Load (ETL) tool was out of the question and so I had no choice but to write T-SQL code (i.e. stored-procedures) to process the incoming messages and load the data into the data warehouse. This concerned me though – I had no way to control the rate at which data would arrive into the system yet we were going to have end-users querying the system at the same time that those messages were arriving; the potential for contention in such a scenario was pretty high and and was something I wanted to minimise as much as possible. Moreover I did not want the processing of data inside the data warehouse to have any impact on the customer-facing website. As you have probably guessed from the title of this blog post this is where Service Broker stepped in! For those that have not heard of it Service Broker is a queuing technology that has been built into SQL Server since SQL Server 2005. It provides a number of features however the one that was of interest to me was the fact that it facilitates asynchronous data processing which, in layman’s terms, means the ability to process some data without requiring the system that supplied the data having to wait for the response. That was a crucial feature because on this project the customer-facing website (in effect an OLTP system) would be calling one of our stored procedures with each message – we did not want to cause the OLTP system to wait on us every time we processed one of those messages. This asynchronous nature also helps to alleviate the contention problem because the asynchronous processing activity is handled just like any other task in the database engine and hence can wait on another task (such as an end-user query). Service Broker it was then! The stored procedure called by the OLTP system would simply put the message onto a queue and we would use a feature called activation to pick each message off the queue in turn and process it into the warehouse. At the time of writing the system is not yet up to full capacity but so far everything seems to be working OK (touch wood) and crucially our users are seeing data in near-real-time. By near-real-time I am talking about latencies of a few minutes at most and to someone like me who is used to building systems that have overnight latencies that is a huge step forward! So then, am I advocating that you all go out and dump your ETL tools? Of course not, no! What this project has taught me though is that in certain scenarios there may be better ways to implement a data warehouse system then the traditional “load data in overnight” approach that we are all used to. Moreover I have really enjoyed getting to grips with a new technology and even if you don’t want to use Service Broker you might want to consider asynchronous messaging architectures for your BI/data warehousing solutions in the future. This has been a very high level overview of my use of Service Broker and I have deliberately left out much of the minutiae of what has been a very challenging implementation. Nonetheless I hope I have caused you to reflect upon your own approaches to BI and question whether other approaches may be more tenable. All comments and questions gratefully received! Lastly, if you have never used Service Broker before and want to kick the tyres I have provided below a very simple “Service Broker Hello World” script that will create all of the objects required to facilitate Service Broker communications and then send the message “Hello World” from one place to anther! This doesn’t represent a “proper” implementation per se because it doesn’t close down down conversation objects (which you should always do in a real-world scenario) but its enough to demonstrate the capabilities! @Jamiet ----------------------------------------------------------------------------------------------- /*This is a basic Service Broker Hello World app. Have fun! -Jamie */ USE MASTER GO CREATE DATABASE SBTest GO --Turn Service Broker on! ALTER DATABASE SBTest SET ENABLE_BROKER GO USE SBTest GO -- 1) we need to create a message type. Note that our message type is -- very simple and allowed any type of content CREATE MESSAGE TYPE HelloMessage VALIDATION = NONE GO -- 2) Once the message type has been created, we need to create a contract -- that specifies who can send what types of messages CREATE CONTRACT HelloContract (HelloMessage SENT BY INITIATOR) GO --We can query the metadata of the objects we just created SELECT * FROM sys.service_message_types WHERE name = 'HelloMessage'; SELECT * FROM sys.service_contracts WHERE name = 'HelloContract'; SELECT * FROM sys.service_contract_message_usages WHERE service_contract_id IN (SELECT service_contract_id FROM sys.service_contracts WHERE name = 'HelloContract') AND message_type_id IN (SELECT message_type_id FROM sys.service_message_types WHERE name = 'HelloMessage'); -- 3) The communication is between two endpoints. Thus, we need two queues to -- hold messages CREATE QUEUE SenderQueue CREATE QUEUE ReceiverQueue GO --more querying metatda SELECT * FROM sys.service_queues WHERE name IN ('SenderQueue','ReceiverQueue'); --we can also select from the queues as if they were tables SELECT * FROM SenderQueue SELECT * FROM ReceiverQueue -- 4) Create the required services and bind them to be above created queues CREATE SERVICE Sender ON QUEUE SenderQueue CREATE SERVICE Receiver ON QUEUE ReceiverQueue (HelloContract) GO --more querying metadata SELECT * FROM sys.services WHERE name IN ('Receiver','Sender'); -- 5) At this point, we can begin the conversation between the two services by -- sending messages DECLARE @conversationHandle UNIQUEIDENTIFIER DECLARE @message NVARCHAR(100) BEGIN BEGIN TRANSACTION; BEGIN DIALOG @conversationHandle FROM SERVICE Sender TO SERVICE 'Receiver' ON CONTRACT HelloContract WITH ENCRYPTION=OFF -- Send a message on the conversation SET @message = N'Hello, World'; SEND ON CONVERSATION @conversationHandle MESSAGE TYPE HelloMessage (@message) COMMIT TRANSACTION END GO --check contents of queues SELECT * FROM SenderQueue SELECT * FROM ReceiverQueue GO -- Receive a message from the queue RECEIVE CONVERT(NVARCHAR(MAX), message_body) AS MESSAGE FROM ReceiverQueue GO --If no messages were received and/or you can't see anything on the queues you may wish to check the following for clues: SELECT * FROM sys.transmission_queue -- Cleanup DROP SERVICE Sender DROP SERVICE Receiver DROP QUEUE SenderQueue DROP QUEUE ReceiverQueue DROP CONTRACT HelloContract DROP MESSAGE TYPE HelloMessage GO USE MASTER GO DROP DATABASE SBTest GO

Read the article

Extending Oracle CEP with Predictive Analytics

- by vikram.shukla(at)oracle.com

Introduction: OCEP is often used as a business rules engine to execute a set of business logic rules via CQL statements, and take decisions based on the outcome of those rules. There are times where configuring rules manually is sufficient because an application needs to deal with only a small and well-defined set of static rules. However, in many situations customers don't want to pre-define such rules for two reasons. First, they are dealing with events with lots of columns and manually crafting such rules for each column or a set of columns and combinations thereof is almost impossible. Second, they are content with probabilistic outcomes and do not care about 100% precision. The former is the case when a user is dealing with data with high dimensionality, the latter when an application can live with "false" positives as they can be discarded after further inspection, say by a Human Task component in a Business Process Management software. The primary goal of this blog post is to show how this can be achieved by combining OCEP with Oracle Data Mining® and leveraging the latter's rich set of algorithms and functionality to do predictive analytics in real time on streaming events. The secondary goal of this post is also to show how OCEP can be extended to invoke any arbitrary external computation in an RDBMS from within CEP. The extensible facility is known as the JDBC cartridge. The rest of the post describes the steps required to achieve this: We use the dataset available at http://blogs.oracle.com/datamining/2010/01/fraud_and_anomaly_detection_made_simple.html to showcase the capabilities. We use it to show how transaction anomalies or fraud can be detected. Building the model: Follow the self-explanatory steps described at the above URL to build the model. It is very simple - it uses built-in Oracle Data Mining PL/SQL packages to cleanse, normalize and build the model out of the dataset. You can also use graphical Oracle Data Miner® to build the models. To summarize, it involves: Specifying which algorithms to use. In this case we use Support Vector Machines as we're trying to find anomalies in highly dimensional dataset.Build model on the data in the table for the algorithms specified. For this example, the table was populated in the scott/tiger schema with appropriate privileges. Configuring the Data Source: This is the first step in building CEP application using such an integration. Our datasource looks as follows in the server config file. It is advisable that you use the Visualizer to add it to the running server dynamically, rather than manually edit the file. <data-source> <name>DataMining</name> <data-source-params> <jndi-names> <element>DataMining</element> </jndi-names> <global-transactions-protocol>OnePhaseCommit</global-transactions-protocol> </data-source-params> <connection-pool-params> <credential-mapping-enabled></credential-mapping-enabled> <test-table-name>SQL SELECT 1 from DUAL</test-table-name> <initial-capacity>1</initial-capacity> <max-capacity>15</max-capacity> <capacity-increment>1</capacity-increment> </connection-pool-params> <driver-params> <use-xa-data-source-interface>true</use-xa-data-source-interface> <driver-name>oracle.jdbc.OracleDriver</driver-name> <url>jdbc:oracle:thin:@localhost:1522:orcl</url> <properties> <element> <value>scott</value> <name>user</name> </element> <element> <value>{Salted-3DES}AzFE5dDbO2g=</value> <name>password</name> </element> <element> <name>com.bea.core.datasource.serviceName</name> <value>oracle11.2g</value> </element> <element> <name>com.bea.core.datasource.serviceVersion</name> <value>11.2.0</value> </element> <element> <name>com.bea.core.datasource.serviceObjectClass</name> <value>java.sql.Driver</value> </element> </properties> </driver-params> </data-source> Designing the EPN: The EPN is very simple in this example. We briefly describe each of the components. The adapter ("DataMiningAdapter") reads data from a .csv file and sends it to the CQL processor downstream. The event payload here is same as that of the table in the database (refer to the attached project or do a "desc table-name" from a SQL*PLUS prompt). While this is for convenience in this example, it need not be the case. One can still omit fields in the streaming events, and need not match all columns in the table on which the model was built. Better yet, it does not even need to have the same name as columns in the table, as long as you alias them in the USING clause of the mining function. (Caveat: they still need to draw values from a similar universe or domain, otherwise it constitutes incorrect usage of the model). There are two things in the CQL processor ("DataMiningProc") that make scoring possible on streaming events. 1. User defined cartridge function Please refer to the OCEP CQL reference manual to find more details about how to define such functions. We include the function below in its entirety for illustration. <?xml version="1.0" encoding="UTF-8"?> <jdbcctxconfig:config xmlns:jdbcctxconfig="http://www.bea.com/ns/wlevs/config/application" xmlns:jc="http://www.oracle.com/ns/ocep/config/jdbc"> <jc:jdbc-ctx> <name>Oracle11gR2</name> <data-source>DataMining</data-source> <function name="prediction2"> <param name="CQLMONTH" type="char"/> <param name="WEEKOFMONTH" type="int"/> <param name="DAYOFWEEK" type="char" /> <param name="MAKE" type="char" /> <param name="ACCIDENTAREA" type="char" /> <param name="DAYOFWEEKCLAIMED" type="char" /> <param name="MONTHCLAIMED" type="char" /> <param name="WEEKOFMONTHCLAIMED" type="int" /> <param name="SEX" type="char" /> <param name="MARITALSTATUS" type="char" /> <param name="AGE" type="int" /> <param name="FAULT" type="char" /> <param name="POLICYTYPE" type="char" /> <param name="VEHICLECATEGORY" type="char" /> <param name="VEHICLEPRICE" type="char" /> <param name="FRAUDFOUND" type="int" /> <param name="POLICYNUMBER" type="int" /> <param name="REPNUMBER" type="int" /> <param name="DEDUCTIBLE" type="int" /> <param name="DRIVERRATING" type="int" /> <param name="DAYSPOLICYACCIDENT" type="char" /> <param name="DAYSPOLICYCLAIM" type="char" /> <param name="PASTNUMOFCLAIMS" type="char" /> <param name="AGEOFVEHICLES" type="char" /> <param name="AGEOFPOLICYHOLDER" type="char" /> <param name="POLICEREPORTFILED" type="char" /> <param name="WITNESSPRESNT" type="char" /> <param name="AGENTTYPE" type="char" /> <param name="NUMOFSUPP" type="char" /> <param name="ADDRCHGCLAIM" type="char" /> <param name="NUMOFCARS" type="char" /> <param name="CQLYEAR" type="int" /> <param name="BASEPOLICY" type="char" /> <return-component-type>char</return-component-type> <sql><![CDATA[ SELECT to_char(PREDICTION_PROBABILITY(CLAIMSMODEL, '0' USING *)) AS probability FROM (SELECT :CQLMONTH AS MONTH, :WEEKOFMONTH AS WEEKOFMONTH, :DAYOFWEEK AS DAYOFWEEK, :MAKE AS MAKE, :ACCIDENTAREA AS ACCIDENTAREA, :DAYOFWEEKCLAIMED AS DAYOFWEEKCLAIMED, :MONTHCLAIMED AS MONTHCLAIMED, :WEEKOFMONTHCLAIMED, :SEX AS SEX, :MARITALSTATUS AS MARITALSTATUS, :AGE AS AGE, :FAULT AS FAULT, :POLICYTYPE AS POLICYTYPE, :VEHICLECATEGORY AS VEHICLECATEGORY, :VEHICLEPRICE AS VEHICLEPRICE, :FRAUDFOUND AS FRAUDFOUND, :POLICYNUMBER AS POLICYNUMBER, :REPNUMBER AS REPNUMBER, :DEDUCTIBLE AS DEDUCTIBLE, :DRIVERRATING AS DRIVERRATING, :DAYSPOLICYACCIDENT AS DAYSPOLICYACCIDENT, :DAYSPOLICYCLAIM AS DAYSPOLICYCLAIM, :PASTNUMOFCLAIMS AS PASTNUMOFCLAIMS, :AGEOFVEHICLES AS AGEOFVEHICLES, :AGEOFPOLICYHOLDER AS AGEOFPOLICYHOLDER, :POLICEREPORTFILED AS POLICEREPORTFILED, :WITNESSPRESNT AS WITNESSPRESENT, :AGENTTYPE AS AGENTTYPE, :NUMOFSUPP AS NUMOFSUPP, :ADDRCHGCLAIM AS ADDRCHGCLAIM, :NUMOFCARS AS NUMOFCARS, :CQLYEAR AS YEAR, :BASEPOLICY AS BASEPOLICY FROM dual) ]]> </sql> </function> </jc:jdbc-ctx> </jdbcctxconfig:config> 2. Invoking the function for each event. Once this function is defined, you can invoke it from CQL as follows: <?xml version="1.0" encoding="UTF-8"?> <wlevs:config xmlns:wlevs="http://www.bea.com/ns/wlevs/config/application"> <processor> <name>DataMiningProc</name> <rules> <query id="q1"><![CDATA[ ISTREAM(SELECT S.CQLMONTH, S.WEEKOFMONTH, S.DAYOFWEEK, S.MAKE, : S.BASEPOLICY, C.F AS probability FROM StreamDataChannel [NOW] AS S, TABLE(prediction2@Oracle11gR2(S.CQLMONTH, S.WEEKOFMONTH, S.DAYOFWEEK, S.MAKE, ..., S.BASEPOLICY) AS F of char) AS C) ]]></query> </rules> </processor> </wlevs:config> Finally, the last stage in the EPN prints out the probability of the event being an anomaly. One can also define a threshold in CQL to filter out events that are normal, i.e., below a certain mark as defined by the analyst or designer. Sample Runs: Now let's see how this behaves when events are streamed through CEP. We use only two events for brevity, one normal and other one not. This is one of the "normal" looking events and the probability of it being anomalous is less than 60%. Event is: eventType=DataMiningOutEvent object=q1 time=2904821976256 S.CQLMONTH=Dec, S.WEEKOFMONTH=5, S.DAYOFWEEK=Wednesday, S.MAKE=Honda, S.ACCIDENTAREA=Urban, S.DAYOFWEEKCLAIMED=Tuesday, S.MONTHCLAIMED=Jan, S.WEEKOFMONTHCLAIMED=1, S.SEX=Female, S.MARITALSTATUS=Single, S.AGE=21, S.FAULT=Policy Holder, S.POLICYTYPE=Sport - Liability, S.VEHICLECATEGORY=Sport, S.VEHICLEPRICE=more than 69000, S.FRAUDFOUND=0, S.POLICYNUMBER=1, S.REPNUMBER=12, S.DEDUCTIBLE=300, S.DRIVERRATING=1, S.DAYSPOLICYACCIDENT=more than 30, S.DAYSPOLICYCLAIM=more than 30, S.PASTNUMOFCLAIMS=none, S.AGEOFVEHICLES=3 years, S.AGEOFPOLICYHOLDER=26 to 30, S.POLICEREPORTFILED=No, S.WITNESSPRESENT=No, S.AGENTTYPE=External, S.NUMOFSUPP=none, S.ADDRCHGCLAIM=1 year, S.NUMOFCARS=3 to 4, S.CQLYEAR=1994, S.BASEPOLICY=Liability, probability=.58931702982118561 isTotalOrderGuarantee=true\nAnamoly probability: .58931702982118561 However, the following event is scored as an anomaly with a very high probability of 89%. So there is likely to be something wrong with it. A close look reveals that the value of "deductible" field (10000) is not "normal". What exactly constitutes normal here?. If you run the query on the database to find ALL distinct values for the "deductible" field, it returns the following set: {300, 400, 500, 700} Event is: eventType=DataMiningOutEvent object=q1 time=2598483773496 S.CQLMONTH=Dec, S.WEEKOFMONTH=5, S.DAYOFWEEK=Wednesday, S.MAKE=Honda, S.ACCIDENTAREA=Urban, S.DAYOFWEEKCLAIMED=Tuesday, S.MONTHCLAIMED=Jan, S.WEEKOFMONTHCLAIMED=1, S.SEX=Female, S.MARITALSTATUS=Single, S.AGE=21, S.FAULT=Policy Holder, S.POLICYTYPE=Sport - Liability, S.VEHICLECATEGORY=Sport, S.VEHICLEPRICE=more than 69000, S.FRAUDFOUND=0, S.POLICYNUMBER=1, S.REPNUMBER=12, S.DEDUCTIBLE=10000, S.DRIVERRATING=1, S.DAYSPOLICYACCIDENT=more than 30, S.DAYSPOLICYCLAIM=more than 30, S.PASTNUMOFCLAIMS=none, S.AGEOFVEHICLES=3 years, S.AGEOFPOLICYHOLDER=26 to 30, S.POLICEREPORTFILED=No, S.WITNESSPRESENT=No, S.AGENTTYPE=External, S.NUMOFSUPP=none, S.ADDRCHGCLAIM=1 year, S.NUMOFCARS=3 to 4, S.CQLYEAR=1994, S.BASEPOLICY=Liability, probability=.89171554529576691 isTotalOrderGuarantee=true\nAnamoly probability: .89171554529576691 Conclusion: By way of this example, we show: real-time scoring of events as they flow through CEP leveraging Oracle Data Mining.how CEP applications can invoke complex arbitrary external computations (function shipping) in an RDBMS.

Read the article

Big Data Matters with ODI12c

- by Madhu Nair

contributed by Mike Eisterer On October 17th, 2013, Oracle announced the release of Oracle Data Integrator 12c (ODI12c). This release signifies improvements to Oracle’s Data Integration portfolio of solutions, particularly Big Data integration. Why Big Data = Big Business Organizations are gaining greater insights and actionability through increased storage, processing and analytical benefits offered by Big Data solutions. New technologies and frameworks like HDFS, NoSQL, Hive and MapReduce support these benefits now. As further data is collected, analytical requirements increase and the complexity of managing transformations and aggregations of data compounds and organizations are in need for scalable Data Integration solutions. ODI12c provides enterprise solutions for the movement, translation and transformation of information and data heterogeneously and in Big Data Environments through: The ability for existing ODI and SQL developers to leverage new Big Data technologies. A metadata focused approach for cataloging, defining and reusing Big Data technologies, mappings and process executions. Integration between many heterogeneous environments and technologies such as HDFS and Hive. Generation of Hive Query Language. Working with Big Data using Knowledge Modules ODI12c provides developers with the ability to define sources and targets and visually develop mappings to effect the movement and transformation of data. As the mappings are created, ODI12c leverages a rich library of prebuilt integrations, known as Knowledge Modules (KMs). These KMs are contextual to the technologies and platforms to be integrated. Steps and actions needed to manage the data integration are pre-built and configured within the KMs. The Oracle Data Integrator Application Adapter for Hadoop provides a series of KMs, specifically designed to integrate with Big Data Technologies. The Big Data KMs include: Check Knowledge Module Reverse Engineer Knowledge Module Hive Transform Knowledge Module Hive Control Append Knowledge Module File to Hive (LOAD DATA) Knowledge Module File-Hive to Oracle (OLH-OSCH) Knowledge Module Nothing to beat an Example: To demonstrate the use of the KMs which are part of the ODI Application Adapter for Hadoop, a mapping may be defined to move data between files and Hive targets. The mapping is defined by dragging the source and target into the mapping, performing the attribute (column) mapping (see Figure 1) and then selecting the KM which will govern the process. In this mapping example, movie data is being moved from an HDFS source into a Hive table. Some of the attributes, such as “CUSTID to custid”, have been mapped over. Figure 1 Defining the Mapping Before the proper KM can be assigned to define the technology for the mapping, it needs to be added to the ODI project. The Big Data KMs have been made available to the project through the KM import process. Generally, this is done prior to defining the mapping. Figure 2 Importing the Big Data Knowledge Modules Following the import, the KMs are available in the Designer Navigator. v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Figure 3 The Project View in Designer, Showing Installed IKMs Once the KM is imported, it may be assigned to the mapping target. This is done by selecting the Physical View of the mapping and examining the Properties of the Target. In this case MOVIAPP_LOG_STAGE is the target of our mapping. Figure 4 Physical View of the Mapping and Assigning the Big Data Knowledge Module to the Target Alternative KMs may have been selected as well, providing flexibility and abstracting the logical mapping from the physical implementation. Our mapping may be applied to other technologies as well. The mapping is now complete and is ready to run. We will see more in a future blog about running a mapping to load Hive. To complete the quick ODI for Big Data Overview, let us take a closer look at what the IKM File to Hive is doing for us. ODI provides differentiated capabilities by defining the process and steps which normally would have to be manually developed, tested and implemented into the KM. As shown in figure 5, the KM is preparing the Hive session, managing the Hive tables, performing the initial load from HDFS and then performing the insert into Hive. HDFS and Hive options are selected graphically, as shown in the properties in Figure 4. Figure 5 Process and Steps Managed by the KM What’s Next Big Data being the shape shifting business challenge it is is fast evolving into the deciding factor between market leaders and others. Now that an introduction to ODI and Big Data has been provided, look for additional blogs coming soon using the Knowledge Modules which make up the Oracle Data Integrator Application Adapter for Hadoop: Importing Big Data Metadata into ODI, Testing Data Stores and Loading Hive Targets Generating Transformations using Hive Query language Loading Oracle from Hadoop Sources For more information now, please visit the Oracle Data Integrator Application Adapter for Hadoop web site, http://www.oracle.com/us/products/middleware/data-integration/hadoop/overview/index.html Do not forget to tune in to the ODI12c Executive Launch webcast on the 12th to hear more about ODI12c and GG12c. Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";}

Read the article

The SPARC SuperCluster

- by Karoly Vegh

Oracle has been providing a lead in the Engineered Systems business for quite a while now, in accordance with the motto "Hardware and Software Engineered to Work Together." Indeed it is hard to find a better definition of these systems. Allow me to summarize the idea. It is: Build a compute platform optimized to run your technologies Develop application aware, intelligently caching storage components Take an impressively fast network technology interconnecting it with the compute nodes Tune the application to scale with the nodes to yet unseen performance Reduce the amount of data moving via compression Provide this all in a pre-integrated single product with a single-pane management interface All these ideas have been around in IT for quite some time now. The real Oracle advantage is adding the last one to put these all together. Oracle has built quite a portfolio of Engineered Systems, to run its technologies - and run those like they never ran before. In this post I'll focus on one of them that serves as a consolidation demigod, a multi-purpose engineered system. As you probably have guessed, I am talking about the SPARC SuperCluster. It has many great features inherited from its predecessors, and it adds several new ones. Allow me to pick out and elaborate about some of the most interesting ones from a technological point of view. I. It is the SPARC SuperCluster T4-4. That is, as compute nodes, it includes SPARC T4-4 servers that we learned to appreciate and respect for their features: The SPARC T4 CPUs: Each CPU has 8 cores, each core runs 8 threads. The SPARC T4-4 servers have 4 sockets. That is, a single compute node can in parallel, simultaneously execute 256 threads. Now, a full-rack SPARC SuperCluster has 4 of these servers on board. Remember the keyword demigod. While retaining the forerunner SPARC T3's exceptional throughput, the SPARC T4 CPUs raise the bar with single performance too - a humble 5x better one than their ancestors. actually, the SPARC T4 CPU cores run in both single-threaded and multi-threaded mode, and switch between these two on-the-fly, fulfilling not only single-threaded OR multi-threaded applications' needs, but even mixed requirements (like in database workloads!). Data security, anyone? Every SPARC T4 CPU core has a built-in encryption engine, that is, encryption algorithms cast into silicon. A PCI controller right on the chip for customers who need I/O performance. Built-in, no-cost Virtualization: Oracle VM for SPARC (the former LDoms or Logical Domains) is not a server-emulation virtualization technology but rather a serverpartitioning one, the hypervisor runs in the server firmware, and all the VMs' HW resources (I/O, CPU, memory) are accessed natively, without performance overhead. This enables customers to run a number of Solaris 10 and Solaris 11 VMs separated, independent of each other within a physical server II. For Database performance, it includes Exadata Storage Cells - one of the main reasons why the Exadata Database Machine performs at diabolic speed. What makes them important? They provide DB backend storage for your Oracle Databases to run on the SPARC SuperCluster, that is what they are built and tuned for DB performance. These storage cells are SQL-aware. That is, if a SPARC T4 database compute node executes a query, it doesn't simply request tons of raw datablocks from the storage, filters the received data, and throws away most of it where the statement doesn't apply, but provides the SQL query to the storage node too. The storage cell software speaks SQL, that is, it is able to prefilter and through that transfer only the relevant data. With this, the traffic between database nodes and storage cells is reduced immensely. Less I/O is a good thing - as they say, all the CPUs of the world do one thing just as fast as any other - and that is waiting for I/O. They don't only pre-filter, but also provide data preprocessing features - e.g. if a DB-node requests an aggregate of data, they can calculate it, and handover only the results, not the whole set. Again, less data to transfer. They support the magical HCC, (Hybrid Columnar Compression). That is, data can be stored in a precompressed form on the storage. Less data to transfer. Of course one can't simply rely on disks for performance, there is Flash Storage included there for caching. III. The low latency, high-speed backbone network: InfiniBand, that interconnects all the members with: Real High Speed: 40 Gbit/s. Full Duplex, of course. Oh, and a really low latency. RDMA. Remote Direct Memory Access. This technology allows the DB nodes to do exactly that. Remotely, directly placing SQL commands into the Memory of the storage cells. Dodging all the network-stack bottlenecks, avoiding overhead, placing requests directly into the process queue. You can also run IP over InfiniBand if you please - that's the way the compute nodes can communicate with each other. IV. Including a general-purpose storage too: the ZFSSA, which is a unified storage, providing NAS and SAN access too, with the following features: NFS over RDMA over InfiniBand. Nothing is faster network-filesystem-wise. All the ZFS features onboard, hybrid storage pools, compression, deduplication, snapshot, replication, NFS and CIFS shares Storageheads in a HA-Cluster configuration providing availability of the data DTrace Live Analytics in a web-based Administration UI Being a general purpose application data storage for your non-database applications running on the SPARC SuperCluster over whichever protocol they prefer, easily replicating, snapshotting, cloning data for them. There's a lot of great technology included in Oracle's SPARC SuperCluster, we have talked its interior through. As for external scalability: you can start with a half- of full- rack SPARC SuperCluster, and scale out to several racks - that is, stacking not separate full-rack SPARC SuperClusters, but extending always one large instance of the size of several full-racks. Yes, over InfiniBand network. Add racks as you grow. What technologies shall run on it? SPARC SuperCluster is a general purpose scaleout consolidation/cloud environment. You can run Oracle Databases with RAC scaling, or Oracle Weblogic (end enjoy the SPARC T4's advantages to run Java). Remember, Oracle technologies have been integrated with the Oracle Engineered Systems - this is the Oracle on Oracle advantage. But you can run other software environments such as SAP if you please too. Run any application that runs on Oracle Solaris 10 or Solaris 11. Separate them in Virtual Machines, or even Oracle Solaris Zones, monitor and manage those from a central UI. Here the key takeaways once again: The SPARC SuperCluster: Is a pre-integrated Engineered System Contains SPARC T4-4 servers with built-in virtualization, cryptography, dynamic threading Contains the Exadata storage cells that intelligently offload the burden of the DB-nodes Contains a highly available ZFS Storage Appliance, that provides SAN/NAS storage in a unified way Combines all these elements over a high-speed, low-latency backbone network implemented with InfiniBand Can grow from a single half-rack to several full-rack size Supports the consolidation of hundreds of applications To summarize: All these technologies are great by themselves, but the real value is like in every other Oracle Engineered System: Integration. All these technologies are tuned to perform together. Together they are way more than the sum of all - and a careful and actually very time consuming integration process is necessary to orchestrate all these for performance. The SPARC SuperCluster's goal is to enable infrastructure operations and offer a pre-integrated solution that can be architected and delivered in hours instead of months of evaluations and tests. The tedious and most importantly time and resource consuming part of the work - testing and evaluating - has been done. Now go, provide services. -- charlie

Read the article

CodePlex Daily Summary for Friday, August 22, 2014

CodePlex Daily Summary for Friday, August 22, 2014Popular ReleasesQuickMon: Version 3.22: This release add two important changes. 1. Config variables at the monitor pack level (global to entire monitor pack for all Collectors) 2. The QuickMon (Windows) service now automatically reloads monitor packs that have been changed since it was started. This means you don't have to restart the service for changes to take effect.SSIS ReportGeneratorTask: ReportGenerator Task 1.8: New version of the SSIS Report Generator Task that supports SQL Server 2008, 2012 and 2014. In addition to minor bug fixes Multi-Value Parameters and Execution Information were integrated. The complete variable and parameter assignment is now a string and can be set dynamically.Corefig for Windows Server 2012 Core and Hyper-V Server 2012: Corefig 1.1.2 ISO: FixesUpdated Hyper-V scripts to use version 2 of the WMI tree. Updated the Hyper-V check for saved VM to look for the proper identifier. Fixed text issues with the licensing tab (thanks to briangw for rooting this problem out). EnhancementsNew (and improved) version number in Corefig.psd1.Outlook 2013 Backup Add-In: Outlook Backup Add-In 1.3: Changelog for new version: Added button in config-window to reset the last backup-time (this will trigger the backup after closing outlook) Minimum interval set to 0 (backup at each closing of outlook) Catch exception when data store entry is corrupt Added two parameters (prefix and suffix) to automatically rename the backup file Updated VSTO-Runtime to 10.0.50325 Upgraded project to Visual Studio 2013 Added optional command to run after backup (e.g. pack backup files, ...) Add...babelua: 1.6.7.0: V1.6.7.0 - 2014.8.21New feature: add a file search window ( ctrl+1 or ALT+L ), like The file search in VC Assistant; Stability improvement: performance improvement when BabeLua load/unload; performance improvement when debugger load lua files;File Explorer for WPF: FileExplorer3_20August2014: Please see Aug14 Update.Open NFe: RDI Open NFe 3.0 (alpha): Atualização para o layout 3.10 da NFe.ODBC Connect: v1.0: ODBC Connect executables for both 32bit and 64bit ODBC data sourcesMSSQL Deployment Tool: Microsoft SQL Deploy Tool v1.3.1: MicrosoftSqlDeployTool: v1.3.1.38348 What's changed? Update namespace and assembly name. Bug fixing.SharePoint 2013 Search Query Tool: SharePoint 2013 Search Query Tool v2.1: Layout improvements Bug fixes Stores auth method and user name Moved experimental settings to Advanced boxCtrlAltStudio Viewer: CtrlAltStudio Viewer 1.2.2.41183 Alpha: This alpha of the CtrlAltStudio Viewer provides some preliminary Oculus Rift DK2 support. For more details, see the release notes linked to below. Release notes: http://ctrlaltstudio.com/viewer/release-notes/1-2-2-41183-alpha Support info: http://ctrlaltstudio.com/viewer/support Privacy policy: http://ctrlaltstudio.com/viewer/privacy Disclaimer: This software is not provided or supported by Linden Lab, the makers of Second Life.HDD Guardian: HDD Guardian 0.6.1: New: package now include smartctl 6.3; Removed: standard notification e-mail. Now you have to set your mail server to send e-mail alerts; Bugfix: USB detection error; custom e-mail server settings issue; bottom panel displays a wrong ATA error count.VG-Ripper & PG-Ripper: VG-Ripper 2.9.62: changes NEW: Added Support for 'MadImage.org' links NEW: Added Support for 'ImgSpot.org' links NEW: Added Support for 'ImgClick.net' links NEW: Added Support for 'Imaaage.com' links NEW: Added Support for 'Image-Bugs.com' links NEW: Added Support for 'Pictomania.org' links NEW: Added Support for 'ImgDap.com' links NEW: Added Support for 'FileSpit.com' links FIXED: 'ImgSee.me' linksMagick.NET: Magick.NET 7.0.0.0001: Magick.NET linked with ImageMagick 7-Beta.CMake Tools for Visual Studio: CMake Tools for Visual Studio 1.2: This release adds the following new features and bug fixes from CMake Tools for Visual Studio 1.1: Added support for CMake 3.0. Added support for word completion. Added IntelliSense support for the CMAKEHOSTSYSTEM_INFORMATION command. Fixed syntax highlighting for tokens beginning with escape sequences. Fixed issue uninstalling CMake Tools for Visual Studio after Visual Studio has been uninstalled.GW2 Personal Assistant Overlay: GW2 Personal Assistant Overlay 1.1: Overview1.1 is the second 'stable' release of the GW2 Personal Assistant Overlay. This version includes just a couple of very minor features and some minor bug fixes. For details regarding installation, setup, and general use, see Documentation. Note: If you were using a previous version, you will probably want to copy over the following user settings files: GW2PAO.DungeonSettings.xml GW2PAO.EventSettings.xml GW2PAO.WvWSettings.xml GW2PAO.ZoneCompletionSettings.xml New FeaturesAdded new "No...Fluentx: Fluentx v1.5.3: Added few more extension methods.fastJSON: v2.1.2: 2.1.2 - bug fix circular referencesJPush.NET: JPush Server SDK 1.2.1 (For JPush V3): Assembly: 1.2.1.24728 JPush REST API Version: v3 JPush Documentation Reference .NET framework: v4.0 or above. Sample: class: JPushClientV3 2014 Augest 15th.SEToolbox: SEToolbox 01.043.008 Release 1: Changed ship/station names to use new DisplayName instead of Beacon/Antenna. Fixed issue with updated SE binaries 01.043.018 using new Voxel Material definitions.New Projects1thManage: GDT for erevery oneCreateProjectOnCodePlex: This is the first project for CoderCamps.HEAD FIRST C# LAB 1 : A DAY AT THE RACES: This has been provided for educational purposes and general discussion to improve coding practices associated with the resources detailed within Head First C#.Introduce Audit logging to your EF application using Repository & Unit of Work: Introduce Auditing in your application that uses Entity Framework by utilizing the Repository and Unit of Work design patterns.License Registration (C++): Allow to create demo version, activate or not a module.MS Word SharepointWiki Plugin: Scope of the Plugin is to enable a Post to a Sharepoint Wiki from within MS Word with Formatted Text and Images.Send My Zip: This app will help you to send the files were zipped then send the email about password information. This project is currently in setup mode and only availablewinhttp: this is a project for http/https download.Wix Builder: WixBuilder focusses on easily generating a WiX script from a project ouput, compile and link it into msi installer using the WiX Toolset.XiamiSig: ????????。

Read the article

BRE (Business Rules Engine) Data Services is out...!!!

- by Vishal

A few months ago we at Tellago had open sourced the BizTalk Data Services. We were meanwhile working on other artifacts which comes along with BizTalk Server like the “Business Rules Engine”. We are happy to announce the first version of BRE Data Services. BRE Data Services is a same concept which we covered through BTS Data Services, providing a RESTFul OData – based API to interact with the Business Rules Engine via HTTP using ATOM Publishing Protocol or JSON as the encoding mechanism. In the first version release, we mainly focused on the browsing, querying and searching BRE artifacts via a RESTFul interface. Also along with that we provide the functionality to execute Business Rules by inserting the Facts for policies via the IUpdatable implementation of WCF Data Services. The BRE Data Services API provides a lightweight interface for managing Business Rules Engine artifacts such as Policies, Rules, Vocabularies, Conditions, Actions, Facts etc. The following are some examples which details some of the available features in the current version of the API. Basic Querying: Querying BRE Policies http://localhost/BREDataServices/BREMananagementService.svc/Policies Querying BRE Rules http://localhost/BREDataServices/BREMananagementService.svc/Rules Querying BRE Vocabularies http://localhost/BREDataServices/BREMananagementService.svc/Vocabularies Navigation: The BRE Data Services API also leverages WCF Data Services to enable navigation across related different BRE objects. Querying a specific Policy http://localhost/BREDataServices/BREMananagementService.svc/Policies(‘PolicyName’) Querying a specific Rule http://localhost/BREDataServices/BREMananagementService.svc/Rules(‘RuleName’) Querying all Rules under a Policy http://localhost/BREDataServices/BREMananagementService.svc/Policies('PolicyName')/Rules Querying all Facts under a Policy http://localhost/BREDataServices/BREMananagementService.svc/Policies('PolicyName')/Facts Querying all Actions for a specific Rule http://localhost/BREDataServices/BREMananagementService.svc/Rules('RuleName')/Actions Querying all Conditions for a specific Rule http://localhost/BREDataServices/BREMananagementService.svc/Rules('RuleName')/Actions Querying a specific Vocabulary: http://localhost/BREDataServices/BREMananagementService.svc/Vocabularies('VocabName') Implementation: With the BRE Data Services, we also provide the functionality of executing a particular policy via HTTP. There are couple of ways you can do that though the API. Ø First is though Service Operations feature of WCF Data Services in which you can execute the Facts by passing them in the URL itself. This is a very simple implementations of the executing the policies due to the limitations & restrictions (only primitive types of input parameters which can be passed) currently of the Service Operations of the WCF Data Services. Below is a code sample. Below is a traced Request/Response message. Ø Second is through the IUpdatable Interface of WCF Data Services. In this method, you can first query the rule which you want to execute and then inserts Facts for that particular Rules and finally when you perform the SaveChanges() call for the IUpdatable Interface API, it executes the policy with the facts which you inserted at runtime. Below is a sample of client side code. Due to the limitations of current version of WCF Data Services where there is no way you can return back the updates happening on the service side back to the client via the SaveChanges() method. Here we are executing the rule passing a serialized XML as Facts and there is no changes made to any data where we can query back to fetch the changes. This is overcome though the first way to executing the policies which is by executing it as a Service Operation call. This actually generates a AtomPub message shown as below: POST /Tellago.BRE.REST.ServiceHost/BREMananagementService.svc/$batch HTTP/1.1 User-Agent: Microsoft ADO.NET Data Services DataServiceVersion: 1.0;NetFx MaxDataServiceVersion: 2.0;NetFx Accept: application/atom+xml,application/xml Accept-Charset: UTF-8 Content-Type: multipart/mixed; boundary=batch_6b9a5ced-5ecb-4585-940a-9d5e704c28c7 Host: localhost:8080 Content-Length: 1481 Expect: 100-continue --batch_6b9a5ced-5ecb-4585-940a-9d5e704c28c7 Content-Type: multipart/mixed; boundary=changeset_184a8c59-a714-4ba9-bb3d-889a88fe24bf --changeset_184a8c59-a714-4ba9-bb3d-889a88fe24bf Content-Type: application/http Content-Transfer-Encoding: binary MERGE http://localhost:8080/Tellago.BRE.REST.ServiceHost/BREMananagementService.svc/Facts('TestPolicy') HTTP/1.1 Content-ID: 4 Content-Type: application/atom+xml;type=entry Content-Length: 927 <?xml version="1.0" encoding="utf-8" standalone="yes"?> <entry xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" font-size: x-small"http://www.w3.org/2005/Atom"> <category scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" term="Tellago.BRE.REST.Resources.Fact" /> <title /> <author> <name /> </author> <updated>2011-01-31T20:09:15.0023982Z</updated> <id>http://localhost:8080/Tellago.BRE.REST.ServiceHost/BREMananagementService.svc/Facts('TestPolicy')</id> <content type="application/xml"> <m:properties> <d:FactInstance><ns0:LoanStatus xmlns:ns0="http://tellago.com"><Age>10</Age><Status>true</Status></ns0:LoanStatus></d:FactInstance> <d:FactType>TestSchema</d:FactType> <d:ID>TestPolicy</d:ID> </m:properties> </content> </entry> --changeset_184a8c59-a714-4ba9-bb3d-889a88fe24bf-- --batch_6b9a5ced-5ecb-4585-940a-9d5e704c28c7— Installation: The installation of the BRE Data Services is pretty straight forward. · Create a new IIS website say BREDataServices. · Download the SourceCode from TellagoCodeplex and copy the content from Tellago.BRE.REST.ServiceHost to the physical location of the above created website. · The appPool account running the website should have admin access to the BizTalkRuleEngineDb database. · TheRight click the BREManagementService.svc in the IIS ContentView for the website and wala.. Conclusion: The BRE Data Services API is an experiment intended to bring the capabilities of RESTful/OData based services to the Traditional BTS/BRE Solutions. The future releases will target on technologies like BAM, ESB Toolkit. This version has been tested with various version of BizTalk Server and we have uploaded the source code to our Tellago's DevLabs workspace at Codeplex. I hope you guys enjoy this release. Keep an eye on our new releases @ Tellago Codeplex. We are working on various other Biztalk Artifacts like BAM, ESB Toolkit. Till than happy BizzRuling…!!! Thanks, Vishal Mody

Read the article

Execution plan warnings–The final chapter

- by Dave Ballantyne

In my previous posts (here and here), I showed examples of some of the execution plan warnings that have been added to SQL Server 2012. There is one other warning that is of interest to me : “Unmatched Indexes”. Firstly, how do I know this is the final one ? The plan is an XML document, right ? So that means that it can have an accompanying XSD. As an XSD is a schema definition, we can poke around inside it to find interesting things that *could* be in the final XML file. The showplan schema is stored in the folder Microsoft SQL Server\110\Tools\Binn\schemas\sqlserver\2004\07\showplan and by comparing schemas over releases you can get a really good idea of any new functionality that has been added. Here is the section of the Sql Server 2012 showplan schema that has been interesting me so far : <xsd:complexType name="AffectingConvertWarningType"> <xsd:annotation> <xsd:documentation>Warning information for plan-affecting type conversion</xsd:documentation> </xsd:annotation> <xsd:sequence>  </xsd:sequence> <xsd:attribute name="ConvertIssue" use="required"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:enumeration value="Cardinality Estimate" /> <xsd:enumeration value="Seek Plan" />  </xsd:restriction> </xsd:simpleType> </xsd:attribute> <xsd:attribute name="Expression" type ="xsd:string" use="required" /></xsd:complexType><xsd:complexType name="WarningsType"> <xsd:annotation> <xsd:documentation>List of all possible iterator or query specific warnings (e.g. hash spilling, no join predicate)</xsd:documentation> </xsd:annotation> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element name="ColumnsWithNoStatistics" type="shp:ColumnReferenceListType" minOccurs="0" maxOccurs="1" /> <xsd:element name="SpillToTempDb" type="shp:SpillToTempDbType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="Wait" type="shp:WaitWarningType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="PlanAffectingConvert" type="shp:AffectingConvertWarningType" minOccurs="0" maxOccurs="unbounded" /> </xsd:choice> <xsd:attribute name="NoJoinPredicate" type="xsd:boolean" use="optional" /> <xsd:attribute name="SpatialGuess" type="xsd:boolean" use="optional" /> <xsd:attribute name="UnmatchedIndexes" type="xsd:boolean" use="optional" /> <xsd:attribute name="FullUpdateForOnlineIndexBuild" type="xsd:boolean" use="optional" /></xsd:complexType> I especially like the “to be extended here” comment, high hopes that we will see more of these in the future. So “Unmatched Indexes” was a warning that I couldn’t get and many thanks must go to Fabiano Amorim (b|t) for showing me the way. Filtered indexes were introduced in Sql Server 2008 and are really useful if you only need to index only a portion of the data within a table. However, if your SQL code uses a variable as a predicate on the filtered data that matches the filtered condition, then the filtered index cannot be used as, naturally, the value in the variable may ( and probably will ) change and therefore will need to read data outside the index. As an aside, you could use option(recompile) here , in which case the optimizer will build a plan specific to the variable values and use the filtered index, but that can bring about other problems. To demonstrate this warning, we need to generate some test data : DROP TABLE #TestTab1GOCREATE TABLE #TestTab1 (Col1 Int not null, Col2 Char(7500) not null, Quantity Int not null)GOINSERT INTO #TestTab1 VALUES (1,1,1),(1,2,5),(1,2,10),(1,3,20), (2,1,101),(2,2,105),(2,2,110),(2,3,120)GO and then add a filtered index CREATE INDEX ixFilter ON #TestTab1 (Col1)WHERE Quantity = 122 Now if we execute SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = 122 We will see the filtered index being scanned But if we parameterize the query DECLARE @i INT = 122SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = @i The plan is very different a table scan, as the value of the variable used in the predicate can change at run time, and also we see the familiar warning triangle. If we now look at the properties pane, we will see two pieces of information “Warnings” and “UnmatchedIndexes”. So, handily, we are being told which filtered index is not being used due to parameterization.

Read the article

Entity Framework 6: Alpha2 Now Available

- by ScottGu

The Entity Framework team recently announced the 2nd alpha release of EF6. The alpha 2 package is available for download from NuGet. Since this is a pre-release package make sure to select “Include Prereleases” in the NuGet package manager, or execute the following from the package manager console to install it: PM> Install-Package EntityFramework -Pre This week’s alpha release includes a bunch of great improvements in the following areas: Async language support is now available for queries and updates when running on .NET 4.5. Custom conventions now provide the ability to override the default conventions that Code First uses for mapping types, properties, etc. to your database. Multi-tenant migrations allow the same database to be used by multiple contexts with full Code First Migrations support for independently evolving the model backing each context. Using Enumerable.Contains in a LINQ query is now handled much more efficiently by EF and the SQL Server provider resulting greatly improved performance. All features of EF6 (except async) are available on both .NET 4 and .NET 4.5. This includes support for enums and spatial types and the performance improvements that were previously only available when using .NET 4.5. Start-up time for many large models has been dramatically improved thanks to improved view generation performance. Below are some additional details about a few of the improvements above: Async Support .NET 4.5 introduced the Task-Based Asynchronous Pattern that uses the async and await keywords to help make writing asynchronous code easier. EF 6 now supports this pattern. This is great for ASP.NET applications as database calls made through EF can now be processed asynchronously – avoiding any blocking of worker threads. This can increase scalability on the server by allowing more requests to be processed while waiting for the database to respond. The following code shows an MVC controller that is querying a database for a list of location entities: public class HomeController : Controller { LocationContext db = new LocationContext(); public async Task<ActionResult> Index() { var locations = await db.Locations.ToListAsync(); return View(locations); } } Notice above the call to the new ToListAsync method with the await keyword. When the web server reaches this code it initiates the database request, but rather than blocking while waiting for the results to come back, the thread that is processing the request returns to the thread pool, allowing ASP.NET to process another incoming request with the same thread. In other words, a thread is only consumed when there is actual processing work to do, allowing the web server to handle more concurrent requests with the same resources. A more detailed walkthrough covering async in EF is available with additional information and examples. Also a walkthrough is available showing how to use async in an ASP.NET MVC application. Custom Conventions When working with EF Code First, the default behavior is to map .NET classes to tables using a set of conventions baked into EF. For example, Code First will detect properties that end with “ID” and configure them automatically as primary keys. However, sometimes you cannot or do not want to follow those conventions and would rather provide your own. For example, maybe your primary key properties all end in “Key” instead of “Id”. Custom conventions allow the default conventions to be overridden or new conventions to be added so that Code First can map by convention using whatever rules make sense for your project. The following code demonstrates using custom conventions to set the precision of all decimals to 5. As with other Code First configuration, this code is placed in the OnModelCreating method which is overridden on your derived DbContext class: protected override void OnModelCreating(DbModelBuilder modelBuilder) { modelBuilder.Properties<decimal>() .Configure(x => x.HasPrecision(5)); } But what if there are a couple of places where a decimal property should have a different precision? Just as with all the existing Code First conventions, this new convention can be overridden for a particular property simply by explicitly configuring that property using either the fluent API or a data annotation. A more detailed description of custom code first conventions is available here. Community Involvement I blogged a while ago about EF being released under an open source license. Since then a number of community members have made contributions and these are included in EF6 alpha 2. Two examples of community contributions are: AlirezaHaghshenas contributed a change that increases the startup performance of EF for larger models by improving the performance of view generation. The change means that it is less often necessary to use of pre-generated views. UnaiZorrilla contributed the first community feature to EF: the ability to load all Code First configuration classes in an assembly with a single method call like the following: protected override void OnModelCreating(DbModelBuilder modelBuilder) { modelBuilder.Configurations .AddFromAssembly(typeof(LocationContext).Assembly); } This code will find and load all the classes that inherit from EntityTypeConfiguration<T> or ComplexTypeConfiguration<T> in the assembly where LocationContext is defined. This reduces the amount of coupling between the context and Code First configuration classes, and is also a very convenient shortcut for large models. Other upcoming features coming in EF 6 Lots of information about the development of EF6 can be found on the EF CodePlex site, including a roadmap showing the other features that are planned for EF6. One of of the nice upcoming features is connection resiliency, which will automate the process of retying database operations on transient failures common in cloud environments and with databases such as the Windows Azure SQL Database. Another often requested feature that will be included in EF6 is the ability to map stored procedures to query and update operations on entities when using Code First. Summary EF6 is the first open source release of Entity Framework being developed in CodePlex. The alpha 2 preview release of EF6 is now available on NuGet, and contains some really great features for you to try. The EF team are always looking for feedback from developers - especially on the new features such as custom Code First conventions and async support. To provide feedback you can post a comment on the EF6 alpha 2 announcement post, start a discussion or file a bug on the CodePlex site. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

Read the article

What's new in EJB 3.2 ? - Java EE 7 chugging along!

- by arungupta

EJB 3.1 added a whole ton of features for simplicity and ease-of-use such as @Singleton, @Asynchronous, @Schedule, Portable JNDI name, EJBContainer.createEJBContainer, EJB 3.1 Lite, and many others. As part of Java EE 7, EJB 3.2 (JSR 345) is making progress and this blog will provide highlights from the work done so far. This release has been particularly kept small but include several minor improvements and tweaks for usability. More features in EJB.Lite Asynchronous session bean Non-persistent EJB Timer service This also means these features can be used in embeddable EJB container and there by improving testability of your application. Pruning - The following features were made Proposed Optional in Java EE 6 and are now made optional. EJB 2.1 and earlier Entity Bean Component Contract for CMP and BMP Client View of an EJB 2.1 and earlier Entity Bean EJB QL: Query Language for CMP Query Methods JAX-RPC-based Web Service Endpoints and Client View The optional features are moved to a separate document and as a result EJB specification is now split into Core and Optional documents. This allows the specification to be more readable and better organized. Updates and Improvements Transactional lifecycle callbacks in Stateful Session Beans, only for CMT. In EJB 3.1, the transaction context for lifecyle callback methods (@PostConstruct, @PreDestroy, @PostActivate, @PrePassivate) are defined as shown. @PostConstruct @PreDestroy @PrePassivate @PostActivate Stateless Unspecified Unspecified N/A N/A Stateful Unspecified Unspecified Unspecified Unspecified Singleton Bean's transaction management type Bean's transaction management type N/A N/A In EJB 3.2, stateful session bean lifecycle callback methods can opt-in to be transactional. These methods are then executed in a transaction context as shown. @PostConstruct @PreDestroy @PrePassivate @PostActivate Stateless Unspecified Unspecified N/A N/A Stateful Bean's transaction management type Bean's transaction management type Bean's transaction management type Bean's transaction management type Singleton Bean's transaction management type Bean's transaction management type N/A N/A For example, the following stateful session bean require a new transaction to be started for @PostConstruct and @PreDestroy lifecycle callback methods. @Statefulpublic class HelloBean { @PersistenceContext(type=PersistenceContextType.EXTENDED) private EntityManager em; @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) @PostConstruct public void init() { myEntity = em.find(...); } @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) @PostConstruct public void destroy() { em.flush(); }} Notice, by default the lifecycle callback methods are not transactional for backwards compatibility. They need to be explicitly opt-in to be made transactional. Opt-out of passivation for stateful session bean - If your stateful session bean needs to stick around or it has non-serializable field then the bean can be opt-out of passivation as shown. @Stateful(passivationCapable=false)public class HelloBean { private NonSerializableType ref = ... . . .} Simplified the rules to define all local/remote views of the bean. For example, if the bean is defined as: @Statelesspublic class Bean implements Foo, Bar { . . .} where Foo and Bar have no annotations of their own, then Foo and Bar are exposed as local views of the bean. The bean may be explicitly marked @Local as @Local@Statelesspublic class Bean implements Foo, Bar { . . .} then this is the same behavior as explained above, i.e. Foo and Bar are local views. If the bean is marked @Remote as: @Remote@Statelesspublic class Bean implements Foo, Bar { . . .} then Foo and Bar are remote views. If an interface is marked @Local or @Remote then each interface need to be explicitly marked explicitly to be exposed as a view. For example: @Remotepublic interface Foo { . . . }@Statelesspublic class Bean implements Foo, Bar { . . .} only exposes one remote interface Foo. Section 4.9.7 from the specification provide more details about this feature. TimerService.getAllTimers is a newly added convenience API that returns all timers in the same bean. This is only for displaying the list of timers as the timer can only be canceled by its owner. Removed restriction to obtain the current class loader, and allow to use java.io package. This is handy if you want to do file access within your beans. JMS 2.0 alignment - A standard list of activation-config properties is now defined destinationLookup connectionFactoryLookup clientId subscriptionName shareSubscriptions Tons of other clarifications through out the spec. Appendix A provide a comprehensive list of changes since EJB 3.1. ThreadContext in Singleton is guaranteed to be thread-safe. Embeddable container implement Autocloseable. A complete replay of Enterprise JavaBeans Today and Tomorrow from JavaOne 2012 can be seen here (click on CON4654_mp4_4654_001 in Media). The specification is still evolving so the actual property or method names or their actual behavior may be different from the currently proposed ones. Are there any improvements that you'd like to see in EJB 3.2 ? The EJB 3.2 Expert Group would love to hear your feedback. An Early Draft of the specification is available. The latest version of the specification can always be downloaded from here. Java EE 7 Specification Status EJB Specification Project JIRA of EJB Specification JSR Expert Group Discussion Archive These features will start showing up in GlassFish 4 Promoted Builds soon.

Read the article

JPA 2?EJB 3.1?JSF 2????????! WebLogic Server 12c?????????Java EE 6??????|WebLogic Channel|??????

- by ???02

2012?2???????????????WebLogic Server 12c?????????Java EE 6?????????????????????????????????????????????????????????????Oracle Enterprise Pack for Eclipse 12c??WebLogic Server 12c(???)????Java EE 6??????3??????????????????????????????JPA 2.0??????????·?????????EJB 3.1???????·???????????????(???)???????O/R?????????????JPA 2.0　Java EE 6????????????????????Web?????????????3?????(3????)???????·????????????·????????????????????????????????JPA(Java Persistence API) 2.0???EJB(Enterprise JavaBeans) 3.1???JSF(JavaServer Faces) 2.0????3????????????????·???????????JPA??Java??????????????·?????????????O/R?????????????????????·???????????EJB?Session Bean??????????????????·??????????????????????JSF???????????????????????????????????????　??????JPA????Oracle Database??EMPLOYEES?????Java??????????????????????Entity Bean??????XML?????????????????????????XML????????????????????????????????????????????????????·?????????????????????????????????????????????????????????????Java EE 6??????JPA 2.0??????????·????????????????????????????????????????????????????????????????????????????????????????　?????????????????????????????????????????????????????????????Oracle Enterprise Pack for Eclipse(OEPE)??????File????????New?-?Other???????　??????New??????????????????????????Web?-?Dynamic Web Project???????Next????????????????Dynamic Web Project?????????????Project name????OOW???????????Target Runtime????New Runtime?????????　???New Server Runtime Environment???????????????Oracle?-?Oracle WebLogic Server 12c(12.1.1)???????Next???????????????????????????WebLogic home????C:\Oracle\Middleware\wlserver_12.1???????Finish?????????????WebLogic Home????????????????????????Java home?????????????????????Finish??????????????????????Dynamic Web Project????????????????Finish??????????????????JPA 2.0??????????·??????　???????????????JPA 2.0???????????????·??????????????????Eclipse??Project Explorer?(??????·???)?????????OOW?????????????????????????????·???????????????Properties?????????????????·???·????????????????????????????Project Facets?????????????JPA??????(?????????????Details?????JPA 2.0?????????????????????)???????????????????Further configuration available?????????　???Modify Faceted Project??????????????????????????????????Connection????????????????????????????Add Connection?????????　??????New Connection Profile????????????????Connection Profile Type????Oracle Database Connection??????Next????????????　???Specify a Driver and Connection Details???????Drivers????Oracle Database 10g Driver Default???????????Properties?????????????????????SIDxeHostlocalhostPort number1521User nameHRPasswordhr　???????????Test Connection??????????????????Ping Succeeded!?????????????????????????????Finish???????????Modify Faceted Project????????OK????????????????Properties for OOW????????OK??????????????????　?????????Eclipse????????????????OOW?????????????????·???????????????JPA Tools?-?Generate Entities from Tables...???????　????Generate Custom Entities???????????????????????????????Schema????HR??????Tables????EMPLOYEES???????????Next????????????　???????????Next???????????Customize Default Entity Generation??????Package????model???????Finish?????????????JPQL??????????　?????????Oracle Database??EMPLOYEES??????????????????·????model.Employee.java?????????????????????????????????·?????OOW????Java Resources?-?src?-?model???????Employee.java????????????????????????????????·???Employee????(Employee.java)?package model; import java.io.Serializable; import java.math.BigDecimal; import java.util.Date; import java.util.Set; import javax.persistence.Column;<...?...>/** * The persistent class for the EMPLOYEES database table. * */ @Entity // ?@Table(name="EMPLOYEES") // ?// Apublic class Employee implements Serializable { 　　　　　　　private static final long serialVersionUID = 1L; 　　　　　　@Id // ?　　　　　　　@Column(name="EMPLOYEE_ID") 　　　　　　　private long employeeId; 　　　　　　　@Column(name="COMMISSION_PCT") 　　　　　　　private BigDecimal commissionPct; 　　　　　　　@Column(name="DEPARTMENT_ID") 　　　　　　　private BigDecimal departmentId; 　　　　　　　private String email; 　　　　　　　@Column(name="FIRST_NAME") 　　　　　　　private String firstName; 　　　　　　@Temporal( TemporalType.DATE) //?　　　　　　　@Column(name="HIRE_DATE") 　　　　　　　private Date hireDate; 　　　　　　　@Column(name="JOB_ID") 　　　　　　　private String jobId; 　　　　　　　@Column(name="LAST_NAME") 　　　　　　　private String lastName; 　　　　　　　@Column(name="PHONE_NUMBER") 　　　　　　　private String phoneNumber; 　　　　　　　private BigDecimal salary; 　　　　　　　//bi-directional many-to-one association to Employee<...?...>} 　???????????????·???????????????????????????????????????????@Table(name="")??????@Table???????????????????????????????????????　?????????????????????????????????????·????????????????　?????????????????????????????SQL?Data??????????　???????????????A?????JPA?????????JPQL(Java Persistence Query Language)?????????????JPQL?????SQL???????????????????????????????????????????????????????????????????????????????????Employee.selectByNameEmployee??firstName????????????????????employeeId?????????　?????????????????????import java.util.Date;import java.util.Set;import javax.persistence.Column;<...?...>/** * The persistent class for the EMPLOYEES database table. * */ @Entity // ?@Table(name="EMPLOYEES") // ?@NamedQueries({　　　　　　　@NamedQuery(name="Employee.selectByName" , query="select e from Employee e where e.firstName like :name order by e.employeeId")})<...?...>　?????????·??????OOW?-?JPA Content?-?persistent.xml??????Connection???????????????Database????JTA data source:???jdbc/test????????????????????????Java EE 6??????JPA 2.0???????????????????????????????????·??????????????????????????????????????SQL????????????????????????·????????????·??????????????XML??????????????????1??????????????????????????????????????????????????????????????????EJB 3.1????????·???????????EJB 3.1????????·?????????????????EJB 3.1?Stateless Session Bean?????·????????????????·???????????????????·???????????????????　EJB3.1?????JPA 2.0???????????·???????????????????????XML???????????????????????????????EJB 3.1?????????·????EJB??????????????????????????????????????????????????????????????　????????EJB 3.1?Session Bean?????·????????????????????????????????????????????????????public List<Employee> getEmp(String keyword)firstName????????????Employee??????　????????????????????·???????????OOW????????????·???????????????New?-?Other???????????????????????????????????EJB?-?Session Bean(EJB 3.x)??????NEXT????????????????????Create EJB 3.x Session Bean?????????????Java Package????ejb???class name????EmpLogic???????????State Type????Stateless?????????No-interface???????????????????????Finish????????????　?????????Stateless Session Bean??????·?????EmpLogic.java????????????????????EmpLogic????·????????EJB?????????????Stateless Session Bean?????????@Stateless?????????????????????????????????????EmpLogic????(EmpLogic.java)?package ejb;import javax.ejb.LocalBean;import javax.ejb.Stateless;<...?...>import model.Employee;@Stateless@LocalBeanpublic class EmpLogic {　　　　　　　public EmpLogic() {　　　　　　　}}　??????????????????????????????????????·???????????????????????import??????????????????EmpLogic??????????????????????????·???????????????????????import????????(EmpLogic.java)?package ejb;import javax.ejb.LocalBean;import javax.ejb.Stateless;import javax.persistence.EntityManager; // ?import javax.persistence.PersistenceContext; // ?<...?...>import model.Employee;@Stateless@LocalBeanpublic class EmpLogic {　　　　　　@PersistenceContext(unitName = "OOW") // ?　　　　　　private EntityManager em; // ?　　　　　　　public EmpLogic() {　　　　　　　}}　?????????·???????JPA???????????????????·????????????????????????????CRUD???????????????????·????????????EntityManager???????????????????????????1????????????????·???????????????????????@PersistenceContext?????unitName?????????????persistence.xml????persistence-unit???name??????????????　???????EmpLogic?????·???????????????????????????????????????????????????????????????????????????????EmpLogic????????·???????(EmpLogic.java)?package ejb;import java.util.List; // ? import javax.ejb.LocalBean;import javax.ejb.Stateless;import javax.persistence.EntityManager; // ? import javax.persistence.PersistenceContext; // ? <...?...>import model.Employee;@Stateless@LocalBeanpublic class EmpLogic {　　　　　　　@PersistenceContext(unitName = "OOW") // ? 　　　　　　　private EntityManager em; // ? 　　　　　　　public EmpLogic() {　　　　　　　}　　　　　　@SuppressWarnings("unchecked") // ?　　　　　　public List<Employee> getEmp(String keyword) { // ?　　　　　　　　　　　　　StringBuilder param = new StringBuilder(); // ?　　　　　　　　　　　　　param.append("%"); // ?　　　　　　　　　　　　　param.append(keyword); // ?　　　　　　　　　　　　　param.append("%"); // ?　　　　　　　　　　　　　return em.createNamedQuery("Employee.selectByName") // ?　　　　　　　　　　　　　　　　　　　　.setParameter("name", param.toString()).getResultList(); // ?　　　　　　}}　???EJB 3.1???Stateless Session Bean??????????　???JSF 2.0???????????????????????????????????????????????????JAX-RS????RESTful?Web??????????????????????

Read the article

Mulitple full joins in Postgres is slow

- by blast83

I have a program to use the IMDB database and am having very slow performance on my query. It appears that it doesn't use my where condition until after it materializes everything. I looked around for hints to use but nothing seems to work. Here is my query: SELECT * FROM name as n1 FULL JOIN aka_name ON n1.id = aka_name.person_id FULL JOIN cast_info as t2 ON n1.id = t2.person_id FULL JOIN person_info as t3 ON n1.id = t3.person_id FULL JOIN char_name as t4 ON t2.person_role_id = t4.id FULL JOIN role_type as t5 ON t2.role_id = t5.id FULL JOIN title as t6 ON t2.movie_id = t6.id FULL JOIN aka_title as t7 ON t6.id = t7.movie_id FULL JOIN complete_cast as t8 ON t6.id = t8.movie_id FULL JOIN kind_type as t9 ON t6.kind_id = t9.id FULL JOIN movie_companies as t10 ON t6.id = t10.movie_id FULL JOIN movie_info as t11 ON t6.id = t11.movie_id FULL JOIN movie_info_idx as t19 ON t6.id = t19.movie_id FULL JOIN movie_keyword as t12 ON t6.id = t12.movie_id FULL JOIN movie_link as t13 ON t6.id = t13.linked_movie_id FULL JOIN link_type as t14 ON t13.link_type_id = t14.id FULL JOIN keyword as t15 ON t12.keyword_id = t15.id FULL JOIN company_name as t16 ON t10.company_id = t16.id FULL JOIN company_type as t17 ON t10.company_type_id = t17.id FULL JOIN comp_cast_type as t18 ON t8.status_id = t18.id WHERE n1.id = 2003 Very table is related to each other on the join via foreign-key constraints and have indexes for all the mentioned columns. The query plan details: "Hash Left Join (cost=5838187.01..13756845.07 rows=15579622 width=835) (actual time=146879.213..146891.861 rows=20 loops=1)" " Hash Cond: (t8.status_id = t18.id)" " -> Hash Left Join (cost=5838185.92..13542624.18 rows=15579622 width=822) (actual time=146879.199..146891.833 rows=20 loops=1)" " Hash Cond: (t10.company_type_id = t17.id)" " -> Hash Left Join (cost=5838184.83..13328403.29 rows=15579622 width=797) (actual time=146879.165..146891.781 rows=20 loops=1)" " Hash Cond: (t10.company_id = t16.id)" " -> Hash Left Join (cost=5828372.95..10061752.03 rows=15579622 width=755) (actual time=146426.483..146429.756 rows=20 loops=1)" " Hash Cond: (t12.keyword_id = t15.id)" " -> Hash Left Join (cost=5825164.23..6914088.45 rows=15579622 width=731) (actual time=146372.411..146372.529 rows=20 loops=1)" " Hash Cond: (t13.link_type_id = t14.id)" " -> Merge Left Join (cost=5825162.82..6699867.24 rows=15579622 width=715) (actual time=146372.366..146372.472 rows=20 loops=1)" " Merge Cond: (t6.id = t13.linked_movie_id)" " -> Merge Left Join (cost=5684009.29..6378956.77 rows=15579622 width=699) (actual time=144019.620..144019.711 rows=20 loops=1)" " Merge Cond: (t6.id = t12.movie_id)" " -> Merge Left Join (cost=5182403.90..5622400.75 rows=8502523 width=687) (actual time=136849.731..136849.809 rows=20 loops=1)" " Merge Cond: (t6.id = t19.movie_id)" " -> Merge Left Join (cost=4974472.00..5315778.48 rows=8502523 width=637) (actual time=134972.032..134972.099 rows=20 loops=1)" " Merge Cond: (t6.id = t11.movie_id)" " -> Merge Left Join (cost=1830064.81..2033131.89 rows=1341632 width=561) (actual time=63784.035..63784.062 rows=2 loops=1)" " Merge Cond: (t6.id = t10.movie_id)" " -> Nested Loop Left Join (cost=1417360.29..1594294.02 rows=1044480 width=521) (actual time=59279.246..59279.264 rows=1 loops=1)" " Join Filter: (t6.kind_id = t9.id)" " -> Merge Left Join (cost=1417359.22..1429787.34 rows=1044480 width=507) (actual time=59279.222..59279.224 rows=1 loops=1)" " Merge Cond: (t6.id = t8.movie_id)" " -> Merge Left Join (cost=1405731.84..1414378.65 rows=1044480 width=491) (actual time=59121.773..59121.775 rows=1 loops=1)" " Merge Cond: (t6.id = t7.movie_id)" " -> Sort (cost=1346206.04..1348817.24 rows=1044480 width=416) (actual time=58095.230..58095.231 rows=1 loops=1)" " Sort Key: t6.id" " Sort Method: quicksort Memory: 17kB" " -> Hash Left Join (cost=172406.29..456387.53 rows=1044480 width=416) (actual time=57969.371..58095.208 rows=1 loops=1)" " Hash Cond: (t2.movie_id = t6.id)" " -> Hash Left Join (cost=104700.38..256885.82 rows=1044480 width=358) (actual time=49981.493..50006.303 rows=1 loops=1)" " Hash Cond: (t2.role_id = t5.id)" " -> Hash Left Join (cost=104699.11..242522.95 rows=1044480 width=343) (actual time=49981.441..50006.250 rows=1 loops=1)" " Hash Cond: (t2.person_role_id = t4.id)" " -> Hash Left Join (cost=464.96..12283.95 rows=1044480 width=269) (actual time=0.071..0.087 rows=1 loops=1)" " Hash Cond: (n1.id = t3.person_id)" " -> Nested Loop Left Join (cost=0.00..49.39 rows=7680 width=160) (actual time=0.051..0.066 rows=1 loops=1)" " -> Nested Loop Left Join (cost=0.00..17.04 rows=3 width=119) (actual time=0.038..0.041 rows=1 loops=1)" " -> Index Scan using name_pkey on name n1 (cost=0.00..8.68 rows=1 width=39) (actual time=0.022..0.024 rows=1 loops=1)" " Index Cond: (id = 2003)" " -> Index Scan using aka_name_idx_person on aka_name (cost=0.00..8.34 rows=1 width=80) (actual time=0.010..0.010 rows=0 loops=1)" " Index Cond: ((aka_name.person_id = 2003) AND (n1.id = aka_name.person_id))" " -> Index Scan using cast_info_idx_pid on cast_info t2 (cost=0.00..10.77 rows=1 width=41) (actual time=0.011..0.020 rows=1 loops=1)" " Index Cond: ((t2.person_id = 2003) AND (n1.id = t2.person_id))" " -> Hash (cost=463.26..463.26 rows=136 width=109) (actual time=0.010..0.010 rows=0 loops=1)" " -> Index Scan using person_info_idx_pid on person_info t3 (cost=0.00..463.26 rows=136 width=109) (actual time=0.009..0.009 rows=0 loops=1)" " Index Cond: (person_id = 2003)" " -> Hash (cost=42697.62..42697.62 rows=2442362 width=74) (actual time=49305.872..49305.872 rows=2442362 loops=1)" " -> Seq Scan on char_name t4 (cost=0.00..42697.62 rows=2442362 width=74) (actual time=14.066..22775.087 rows=2442362 loops=1)" " -> Hash (cost=1.12..1.12 rows=12 width=15) (actual time=0.024..0.024 rows=12 loops=1)" " -> Seq Scan on role_type t5 (cost=0.00..1.12 rows=12 width=15) (actual time=0.012..0.014 rows=12 loops=1)" " -> Hash (cost=31134.07..31134.07 rows=1573507 width=58) (actual time=7841.225..7841.225 rows=1573507 loops=1)" " -> Seq Scan on title t6 (cost=0.00..31134.07 rows=1573507 width=58) (actual time=21.507..2799.443 rows=1573507 loops=1)" " -> Materialize (cost=59525.80..63203.88 rows=294246 width=75) (actual time=812.376..984.958 rows=192075 loops=1)" " -> Sort (cost=59525.80..60261.42 rows=294246 width=75) (actual time=812.363..922.452 rows=192075 loops=1)" " Sort Key: t7.movie_id" " Sort Method: external merge Disk: 24880kB" " -> Seq Scan on aka_title t7 (cost=0.00..6646.46 rows=294246 width=75) (actual time=24.652..164.822 rows=294246 loops=1)" " -> Materialize (cost=11627.38..12884.43 rows=100564 width=16) (actual time=123.819..149.086 rows=41907 loops=1)" " -> Sort (cost=11627.38..11878.79 rows=100564 width=16) (actual time=123.807..138.530 rows=41907 loops=1)" " Sort Key: t8.movie_id" " Sort Method: external merge Disk: 3136kB" " -> Seq Scan on complete_cast t8 (cost=0.00..1549.64 rows=100564 width=16) (actual time=0.013..10.744 rows=100564 loops=1)" " -> Materialize (cost=1.08..1.15 rows=7 width=14) (actual time=0.016..0.029 rows=7 loops=1)" " -> Seq Scan on kind_type t9 (cost=0.00..1.07 rows=7 width=14) (actual time=0.011..0.013 rows=7 loops=1)" " -> Materialize (cost=412704.52..437969.09 rows=2021166 width=40) (actual time=3420.356..4278.545 rows=1028995 loops=1)" " -> Sort (cost=412704.52..417757.43 rows=2021166 width=40) (actual time=3420.349..3953.483 rows=1028995 loops=1)" " Sort Key: t10.movie_id" " Sort Method: external merge Disk: 90960kB" " -> Seq Scan on movie_companies t10 (cost=0.00..35214.66 rows=2021166 width=40) (actual time=13.271..566.893 rows=2021166 loops=1)" " -> Materialize (cost=3144407.19..3269057.42 rows=9972019 width=76) (actual time=65485.672..70083.219 rows=5039009 loops=1)" " -> Sort (cost=3144407.19..3169337.23 rows=9972019 width=76) (actual time=65485.667..68385.550 rows=5038999 loops=1)" " Sort Key: t11.movie_id" " Sort Method: external merge Disk: 735512kB" " -> Seq Scan on movie_info t11 (cost=0.00..212815.19 rows=9972019 width=76) (actual time=15.750..15715.608 rows=9972019 loops=1)" " -> Materialize (cost=207925.01..219867.92 rows=955433 width=50) (actual time=1483.989..1785.636 rows=429401 loops=1)" " -> Sort (cost=207925.01..210313.59 rows=955433 width=50) (actual time=1483.983..1654.165 rows=429401 loops=1)" " Sort Key: t19.movie_id" " Sort Method: external merge Disk: 31720kB" " -> Seq Scan on movie_info_idx t19 (cost=0.00..15047.33 rows=955433 width=50) (actual time=7.284..221.597 rows=955433 loops=1)" " -> Materialize (cost=501605.39..537645.64 rows=2883220 width=12) (actual time=5823.040..6868.242 rows=1597396 loops=1)" " -> Sort (cost=501605.39..508813.44 rows=2883220 width=12) (actual time=5823.026..6477.517 rows=1597396 loops=1)" " Sort Key: t12.movie_id" " Sort Method: external merge Disk: 78888kB" " -> Seq Scan on movie_keyword t12 (cost=0.00..44417.20 rows=2883220 width=12) (actual time=11.672..839.498 rows=2883220 loops=1)" " -> Materialize (cost=141143.93..152995.81 rows=948150 width=16) (actual time=1916.356..2253.004 rows=478358 loops=1)" " -> Sort (cost=141143.93..143514.31 rows=948150 width=16) (actual time=1916.344..2125.698 rows=478358 loops=1)" " Sort Key: t13.linked_movie_id" " Sort Method: external merge Disk: 29632kB" " -> Seq Scan on movie_link t13 (cost=0.00..14607.50 rows=948150 width=16) (actual time=27.610..297.962 rows=948150 loops=1)" " -> Hash (cost=1.18..1.18 rows=18 width=16) (actual time=0.020..0.020 rows=18 loops=1)" " -> Seq Scan on link_type t14 (cost=0.00..1.18 rows=18 width=16) (actual time=0.010..0.012 rows=18 loops=1)" " -> Hash (cost=1537.10..1537.10 rows=91010 width=24) (actual time=54.055..54.055 rows=91010 loops=1)" " -> Seq Scan on keyword t15 (cost=0.00..1537.10 rows=91010 width=24) (actual time=0.006..14.703 rows=91010 loops=1)" " -> Hash (cost=4585.61..4585.61 rows=245461 width=42) (actual time=445.269..445.269 rows=245461 loops=1)" " -> Seq Scan on company_name t16 (cost=0.00..4585.61 rows=245461 width=42) (actual time=12.037..309.961 rows=245461 loops=1)" " -> Hash (cost=1.04..1.04 rows=4 width=25) (actual time=0.013..0.013 rows=4 loops=1)" " -> Seq Scan on company_type t17 (cost=0.00..1.04 rows=4 width=25) (actual time=0.009..0.010 rows=4 loops=1)" " -> Hash (cost=1.04..1.04 rows=4 width=13) (actual time=0.006..0.006 rows=4 loops=1)" " -> Seq Scan on comp_cast_type t18 (cost=0.00..1.04 rows=4 width=13) (actual time=0.002..0.003 rows=4 loops=1)" "Total runtime: 147055.016 ms" Is there anyway to force the name.id = 2003 before it tries to join all the tables together? As you can see, the end result is 4 tuples but it seems like it should be a fast join by using the available index after it limited it down with the name clause, although very complex.

Read the article

Squid + Dans Guardian (simple configuration)

- by The Digital Ninja

I just built a new proxy server and compiled the latest versions of squid and dansguardian. We use basic authentication to select what users are allowed outside of our network. It seems squid is working just fine and accepts my username and password and lets me out. But if i connect to dans guardian, it prompts for username and password and then displays a message saying my username is not allowed to access the internet. Its pulling my username for the error message so i know it knows who i am. The part i get confused on is i thought that part was handled all by squid, and squid is working flawlessly. Can someone please double check my config files and tell me if i'm missing something or there is some new option i must set to get this to work. dansguardian.conf # Web Access Denied Reporting (does not affect logging) # # -1 = log, but do not block - Stealth mode # 0 = just say 'Access Denied' # 1 = report why but not what denied phrase # 2 = report fully # 3 = use HTML template file (accessdeniedaddress ignored) - recommended # reportinglevel = 3 # Language dir where languages are stored for internationalisation. # The HTML template within this dir is only used when reportinglevel # is set to 3. When used, DansGuardian will display the HTML file instead of # using the perl cgi script. This option is faster, cleaner # and easier to customise the access denied page. # The language file is used no matter what setting however. # languagedir = '/etc/dansguardian/languages' # language to use from languagedir. language = 'ukenglish' # Logging Settings # # 0 = none 1 = just denied 2 = all text based 3 = all requests loglevel = 3 # Log Exception Hits # Log if an exception (user, ip, URL, phrase) is matched and so # the page gets let through. Can be useful for diagnosing # why a site gets through the filter. on | off logexceptionhits = on # Log File Format # 1 = DansGuardian format 2 = CSV-style format # 3 = Squid Log File Format 4 = Tab delimited logfileformat = 1 # Log file location # # Defines the log directory and filename. #loglocation = '/var/log/dansguardian/access.log' # Network Settings # # the IP that DansGuardian listens on. If left blank DansGuardian will # listen on all IPs. That would include all NICs, loopback, modem, etc. # Normally you would have your firewall protecting this, but if you want # you can limit it to only 1 IP. Yes only one. filterip = # the port that DansGuardian listens to. filterport = 8080 # the ip of the proxy (default is the loopback - i.e. this server) proxyip = 127.0.0.1 # the port DansGuardian connects to proxy on proxyport = 3128 # accessdeniedaddress is the address of your web server to which the cgi # dansguardian reporting script was copied # Do NOT change from the default if you are not using the cgi. # accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl' # Non standard delimiter (only used with accessdeniedaddress) # Default is enabled but to go back to the original standard mode dissable it. nonstandarddelimiter = on # Banned image replacement # Images that are banned due to domain/url/etc reasons including those # in the adverts blacklists can be replaced by an image. This will, # for example, hide images from advert sites and remove broken image # icons from banned domains. # 0 = off # 1 = on (default) usecustombannedimage = 1 custombannedimagefile = '/etc/dansguardian/transparent1x1.gif' # Filter groups options # filtergroups sets the number of filter groups. A filter group is a set of content # filtering options you can apply to a group of users. The value must be 1 or more. # DansGuardian will automatically look for dansguardianfN.conf where N is the filter # group. To assign users to groups use the filtergroupslist option. All users default # to filter group 1. You must have some sort of authentication to be able to map users # to a group. The more filter groups the more copies of the lists will be in RAM so # use as few as possible. filtergroups = 1 filtergroupslist = '/etc/dansguardian/filtergroupslist' # Authentication files location bannediplist = '/etc/dansguardian/bannediplist' exceptioniplist = '/etc/dansguardian/exceptioniplist' banneduserlist = '/etc/dansguardian/banneduserlist' exceptionuserlist = '/etc/dansguardian/exceptionuserlist' # Show weighted phrases found # If enabled then the phrases found that made up the total which excedes # the naughtyness limit will be logged and, if the reporting level is # high enough, reported. on | off showweightedfound = on # Weighted phrase mode # There are 3 possible modes of operation: # 0 = off = do not use the weighted phrase feature. # 1 = on, normal = normal weighted phrase operation. # 2 = on, singular = each weighted phrase found only counts once on a page. # weightedphrasemode = 2 # Positive result caching for text URLs # Caches good pages so they don't need to be scanned again # 0 = off (recommended for ISPs with users with disimilar browsing) # 1000 = recommended for most users # 5000 = suggested max upper limit urlcachenumber = # # Age before they are stale and should be ignored in seconds # 0 = never # 900 = recommended = 15 mins urlcacheage = # Smart and Raw phrase content filtering options # Smart is where the multiple spaces and HTML are removed before phrase filtering # Raw is where the raw HTML including meta tags are phrase filtered # CPU usage can be effectively halved by using setting 0 or 1 # 0 = raw only # 1 = smart only # 2 = both (default) phrasefiltermode = 2 # Lower casing options # When a document is scanned the uppercase letters are converted to lower case # in order to compare them with the phrases. However this can break Big5 and # other 16-bit texts. If needed preserve the case. As of version 2.7.0 accented # characters are supported. # 0 = force lower case (default) # 1 = do not change case preservecase = 0 # Hex decoding options # When a document is scanned it can optionally convert %XX to chars. # If you find documents are getting past the phrase filtering due to encoding # then enable. However this can break Big5 and other 16-bit texts. # 0 = disabled (default) # 1 = enabled hexdecodecontent = 0 # Force Quick Search rather than DFA search algorithm # The current DFA implementation is not totally 16-bit character compatible # but is used by default as it handles large phrase lists much faster. # If you wish to use a large number of 16-bit character phrases then # enable this option. # 0 = off (default) # 1 = on (Big5 compatible) forcequicksearch = 0 # Reverse lookups for banned site and URLs. # If set to on, DansGuardian will look up the forward DNS for an IP URL # address and search for both in the banned site and URL lists. This would # prevent a user from simply entering the IP for a banned address. # It will reduce searching speed somewhat so unless you have a local caching # DNS server, leave it off and use the Blanket IP Block option in the # bannedsitelist file instead. reverseaddresslookups = off # Reverse lookups for banned and exception IP lists. # If set to on, DansGuardian will look up the forward DNS for the IP # of the connecting computer. This means you can put in hostnames in # the exceptioniplist and bannediplist. # It will reduce searching speed somewhat so unless you have a local DNS server, # leave it off. reverseclientiplookups = off # Build bannedsitelist and bannedurllist cache files. # This will compare the date stamp of the list file with the date stamp of # the cache file and will recreate as needed. # If a bsl or bul .processed file exists, then that will be used instead. # It will increase process start speed by 300%. On slow computers this will # be significant. Fast computers do not need this option. on | off createlistcachefiles = on # POST protection (web upload and forms) # does not block forms without any file upload, i.e. this is just for # blocking or limiting uploads # measured in kibibytes after MIME encoding and header bumph # use 0 for a complete block # use higher (e.g. 512 = 512Kbytes) for limiting # use -1 for no blocking #maxuploadsize = 512 #maxuploadsize = 0 maxuploadsize = -1 # Max content filter page size # Sometimes web servers label binary files as text which can be very # large which causes a huge drain on memory and cpu resources. # To counter this, you can limit the size of the document to be # filtered and get it to just pass it straight through. # This setting also applies to content regular expression modification. # The size is in Kibibytes - eg 2048 = 2Mb # use 0 for no limit maxcontentfiltersize = # Username identification methods (used in logging) # You can have as many methods as you want and not just one. The first one # will be used then if no username is found, the next will be used. # * proxyauth is for when basic proxy authentication is used (no good for # transparent proxying). # * ntlm is for when the proxy supports the MS NTLM authentication # protocol. (Only works with IE5.5 sp1 and later). **NOT IMPLEMENTED** # * ident is for when the others don't work. It will contact the computer # that the connection came from and try to connect to an identd server # and query it for the user owner of the connection. usernameidmethodproxyauth = on usernameidmethodntlm = off # **NOT IMPLEMENTED** usernameidmethodident = off # Preemptive banning - this means that if you have proxy auth enabled and a user accesses # a site banned by URL for example they will be denied straight away without a request # for their user and pass. This has the effect of requiring the user to visit a clean # site first before it knows who they are and thus maybe an admin user. # This is how DansGuardian has always worked but in some situations it is less than # ideal. So you can optionally disable it. Default is on. # As a side effect disabling this makes AD image replacement work better as the mime # type is know. preemptivebanning = on # Misc settings # if on it adds an X-Forwarded-For: <clientip> to the HTTP request # header. This may help solve some problem sites that need to know the # source ip. on | off forwardedfor = on # if on it uses the X-Forwarded-For: <clientip> to determine the client # IP. This is for when you have squid between the clients and DansGuardian. # Warning - headers are easily spoofed. on | off usexforwardedfor = off # if on it logs some debug info regarding fork()ing and accept()ing which # can usually be ignored. These are logged by syslog. It is safe to leave # it on or off logconnectionhandlingerrors = on # Fork pool options # sets the maximum number of processes to sporn to handle the incomming # connections. Max value usually 250 depending on OS. # On large sites you might want to try 180. maxchildren = 180 # sets the minimum number of processes to sporn to handle the incomming connections. # On large sites you might want to try 32. minchildren = 32 # sets the minimum number of processes to be kept ready to handle connections. # On large sites you might want to try 8. minsparechildren = 8 # sets the minimum number of processes to sporn when it runs out # On large sites you might want to try 10. preforkchildren = 10 # sets the maximum number of processes to have doing nothing. # When this many are spare it will cull some of them. # On large sites you might want to try 64. maxsparechildren = 64 # sets the maximum age of a child process before it croaks it. # This is the number of connections they handle before exiting. # On large sites you might want to try 10000. maxagechildren = 5000 # Process options # (Change these only if you really know what you are doing). # These options allow you to run multiple instances of DansGuardian on a single machine. # Remember to edit the log file path above also if that is your intention. # IPC filename # # Defines IPC server directory and filename used to communicate with the log process. ipcfilename = '/tmp/.dguardianipc' # URL list IPC filename # # Defines URL list IPC server directory and filename used to communicate with the URL # cache process. urlipcfilename = '/tmp/.dguardianurlipc' # PID filename # # Defines process id directory and filename. #pidfilename = '/var/run/dansguardian.pid' # Disable daemoning # If enabled the process will not fork into the background. # It is not usually advantageous to do this. # on|off ( defaults to off ) nodaemon = off # Disable logging process # on|off ( defaults to off ) nologger = off # Daemon runas user and group # This is the user that DansGuardian runs as. Normally the user/group nobody. # Uncomment to use. Defaults to the user set at compile time. # daemonuser = 'nobody' # daemongroup = 'nobody' # Soft restart # When on this disables the forced killing off all processes in the process group. # This is not to be confused with the -g run time option - they are not related. # on|off ( defaults to off ) softrestart = off maxcontentramcachescansize = 2000 maxcontentfilecachescansize = 20000 downloadmanager = '/etc/dansguardian/downloadmanagers/default.conf' authplugin = '/etc/dansguardian/authplugins/proxy-basic.conf' Squid.conf http_port 3128 hierarchy_stoplist cgi-bin ? acl QUERY urlpath_regex cgi-bin \? cache deny QUERY acl apache rep_header Server ^Apache #broken_vary_encoding allow apache access_log /squid/var/logs/access.log squid hosts_file /etc/hosts auth_param basic program /squid/libexec/ncsa_auth /squid/etc/userbasic.auth auth_param basic children 5 auth_param basic realm proxy auth_param basic credentialsttl 2 hours auth_param basic casesensitive off refresh_pattern ^ftp: 1440 20% 10080 refresh_pattern ^gopher: 1440 0% 1440 refresh_pattern . 0 20% 4320 acl NoAuthNec src <HIDDEN FOR SECURITY> acl BrkRm src <HIDDEN FOR SECURITY> acl Dials src <HIDDEN FOR SECURITY> acl Comps src <HIDDEN FOR SECURITY> acl whsws dstdom_regex -i .opensuse.org .novell.com .suse.com mirror.mcs.an1.gov mirrors.kernerl.org www.suse.de suse.mirrors.tds.net mirrros.usc.edu ftp.ale.org suse.cs.utah.edu mirrors.usc.edu mirror.usc.an1.gov linux.nssl.noaa.gov noaa.gov .kernel.org ftp.ale.org ftp.gwdg.de .medibuntu.org mirrors.xmission.com .canonical.com .ubuntu. acl opensites dstdom_regex -i .mbsbooks.com .bowker.com .usps.com .usps.gov .ups.com .fedex.com go.microsoft.com .microsoft.com .apple.com toolbar.msn.com .contacts.msn.com update.services.openoffice.org fms2.pointroll.speedera.net services.wmdrm.windowsmedia.com windowsupdate.com .adobe.com .symantec.com .vitalbook.com vxn1.datawire.net vxn.datawire.net download.lavasoft.de .download.lavasoft.com .lavasoft.com updates.ls-servers.com .canadapost. .myyellow.com minirick symantecliveupdate.com wm.overdrive.com www.overdrive.com productactivation.one.microsoft.com www.update.microsoft.com testdrive.whoson.com www.columbia.k12.mo.us banners.wunderground.com .kofax.com .gotomeeting.com tools.google.com .dl.google.com .cache.googlevideo.com .gpdl.google.com .clients.google.com cache.pack.google.com kh.google.com maps.google.com auth.keyhole.com .contacts.msn.com .hrblock.com .taxcut.com .merchantadvantage.com .jtv.com .malwarebytes.org www.google-analytics.com dcs.support.xerox.com .dhl.com .webtrendslive.com javadl-esd.sun.com javadl-alt.sun.com .excelsior.edu .dhlglobalmail.com .nessus.org .foxitsoftware.com foxit.vo.llnwd.net installshield.com .mindjet.com .mediascouter.com media.us.elsevierhealth.com .xplana.com .govtrack.us sa.tulsacc.edu .omniture.com fpdownload.macromedia.com webservices.amazon.com acl password proxy_auth REQUIRED acl all src all acl manager proto cache_object acl localhost src 127.0.0.1/255.255.255.255 acl to_localhost dst 127.0.0.0/8 acl SSL_ports port 443 563 631 2001 2005 8731 9001 9080 10000 acl Safe_ports port 80 # http acl Safe_ports port 21 # ftp acl Safe_ports port # https, snews 443 563 acl Safe_ports port 70 # gopher acl Safe_ports port 210 # wais acl Safe_ports port # unregistered ports 1936-65535 acl Safe_ports port 280 # http-mgmt acl Safe_ports port 488 # gss-http acl Safe_ports port 10000 acl Safe_ports port 631 acl Safe_ports port 901 # SWAT acl purge method PURGE acl CONNECT method CONNECT acl UTubeUsers proxy_auth "/squid/etc/utubeusers.list" acl RestrictUTube dstdom_regex -i youtube.com acl RestrictFacebook dstdom_regex -i facebook.com acl FacebookUsers proxy_auth "/squid/etc/facebookusers.list" acl BuemerKEC src 10.10.128.0/24 acl MBSsortnet src 10.10.128.0/26 acl MSNExplorer browser -i MSN acl Printers src <HIDDEN FOR SECURITY> acl SpecialFolks src <HIDDEN FOR SECURITY> # streaming download acl fails rep_mime_type ^.*mms.* acl fails rep_mime_type ^.*ms-hdr.* acl fails rep_mime_type ^.*x-fcs.* acl fails rep_mime_type ^.*x-ms-asf.* acl fails2 urlpath_regex dvrplayer mediastream mms:// acl fails2 urlpath_regex \.asf$ \.afx$ \.flv$ \.swf$ acl deny_rep_mime_flashvideo rep_mime_type -i video/flv acl deny_rep_mime_shockwave rep_mime_type -i ^application/x-shockwave-flash$ acl x-type req_mime_type -i ^application/octet-stream$ acl x-type req_mime_type -i application/octet-stream acl x-type req_mime_type -i ^application/x-mplayer2$ acl x-type req_mime_type -i application/x-mplayer2 acl x-type req_mime_type -i ^application/x-oleobject$ acl x-type req_mime_type -i application/x-oleobject acl x-type req_mime_type -i application/x-pncmd acl x-type req_mime_type -i ^video/x-ms-asf$ acl x-type2 rep_mime_type -i ^application/octet-stream$ acl x-type2 rep_mime_type -i application/octet-stream acl x-type2 rep_mime_type -i ^application/x-mplayer2$ acl x-type2 rep_mime_type -i application/x-mplayer2 acl x-type2 rep_mime_type -i ^application/x-oleobject$ acl x-type2 rep_mime_type -i application/x-oleobject acl x-type2 rep_mime_type -i application/x-pncmd acl x-type2 rep_mime_type -i ^video/x-ms-asf$ acl RestrictHulu dstdom_regex -i hulu.com acl broken dstdomain cms.montgomerycollege.edu events.columbiamochamber.com members.columbiamochamber.com public.genexusserver.com acl RestrictVimeo dstdom_regex -i vimeo.com acl http_port port 80 #http_reply_access deny deny_rep_mime_flashvideo #http_reply_access deny deny_rep_mime_shockwave #streaming files #http_access deny fails #http_reply_access deny fails #http_access deny fails2 #http_reply_access deny fails2 #http_access deny x-type #http_reply_access deny x-type #http_access deny x-type2 #http_reply_access deny x-type2 follow_x_forwarded_for allow localhost acl_uses_indirect_client on log_uses_indirect_client on http_access allow manager localhost http_access deny manager http_access allow purge localhost http_access deny purge http_access allow SpecialFolks http_access deny CONNECT !SSL_ports http_access allow whsws http_access allow opensites http_access deny BuemerKEC !MBSsortnet http_access deny BrkRm RestrictUTube RestrictFacebook RestrictVimeo http_access allow RestrictUTube UTubeUsers http_access deny RestrictUTube http_access allow RestrictFacebook FacebookUsers http_access deny RestrictFacebook http_access deny RestrictHulu http_access allow NoAuthNec http_access allow BrkRm http_access allow FacebookUsers RestrictVimeo http_access deny RestrictVimeo http_access allow Comps http_access allow Dials http_access allow Printers http_access allow password http_access deny !Safe_ports http_access deny SSL_ports !CONNECT http_access allow http_port http_access deny all http_reply_access allow all icp_access allow all access_log /squid/var/logs/access.log squid visible_hostname proxy.site.com forwarded_for off coredump_dir /squid/cache/ #header_access Accept-Encoding deny broken #acl snmppublic snmp_community mysecretcommunity #snmp_port 3401 #snmp_access allow snmppublic all cache_mem 3 GB #acl snmppublic snmp_community mbssquid #snmp_port 3401 #snmp_access allow snmppublic all

Search Results

Search found 21942 results on 878 pages for 'named query'.

Page 623/878 | < Previous Page | 619 620 621 622 623 624 625 626 627 628 629 630 | Next Page >

- by Web Master

- by user22

- by Don

- by Naitik Semwaal

- by Rick Strahl

- by Rob Farley

- by [email protected]

- by Phil Factor

- by Gregory Burd

- by john.rose

- by Jesse

- by Mat Keep

- by Bertrand Le Roy

- by jamiet

- by vikram.shukla(at)oracle.com

- by Madhu Nair

- by Karoly Vegh

- by Vishal

- by Dave Ballantyne

- by ScottGu

- by arungupta

- by ???02

- by blast83

- by The Digital Ninja

< Previous Page | 619 620 621 622 623 624 625 626 627 628 629 630 | Next Page >