Search Results

Search found 3821 results on 153 pages for 'django aggregation'.

Page 149/153 | < Previous Page | 145 146 147 148 149 150 151 152 153 | Next Page >

Parallelism in .NET – Part 10, Cancellation in PLINQ and the Parallel class

- by Reed

Many routines are parallelized because they are long running processes. When writing an algorithm that will run for a long period of time, its typically a good practice to allow that routine to be cancelled. I previously discussed terminating a parallel loop from within, but have not demonstrated how a routine can be cancelled from the caller’s perspective. Cancellation in PLINQ and the Task Parallel Library is handled through a new, unified cooperative cancellation model introduced with .NET 4.0. Cancellation in .NET 4 is based around a new, lightweight struct called CancellationToken. A CancellationToken is a small, thread-safe value type which is generated via a CancellationTokenSource. There are many goals which led to this design. For our purposes, we will focus on a couple of specific design decisions: Cancellation is cooperative. A calling method can request a cancellation, but it’s up to the processing routine to terminate – it is not forced. Cancellation is consistent. A single method call requests a cancellation on every copied CancellationToken in the routine. Let’s begin by looking at how we can cancel a PLINQ query. Supposed we wanted to provide the option to cancel our query from Part 6: double min = collection .AsParallel() .Min(item => item.PerformComputation()); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } We would rewrite this to allow for cancellation by adding a call to ParallelEnumerable.WithCancellation as follows: var cts = new CancellationTokenSource(); // Pass cts here to a routine that could, // in parallel, request a cancellation try { double min = collection .AsParallel() .WithCancellation(cts.Token) .Min(item => item.PerformComputation()); } catch (OperationCanceledException e) { // Query was cancelled before it finished } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, if the user calls cts.Cancel() before the PLINQ query completes, the query will stop processing, and an OperationCanceledException will be raised. Be aware, however, that cancellation will not be instantaneous. When cts.Cancel() is called, the query will only stop after the current item.PerformComputation() elements all finish processing. cts.Cancel() will prevent PLINQ from scheduling a new task for a new element, but will not stop items which are currently being processed. This goes back to the first goal I mentioned – Cancellation is cooperative. Here, we’re requesting the cancellation, but it’s up to PLINQ to terminate. If we wanted to allow cancellation to occur within our routine, we would need to change our routine to accept a CancellationToken, and modify it to handle this specific case: public void PerformComputation(CancellationToken token) { for (int i=0; i<this.iterations; ++i) { // Add a check to see if we've been canceled // If a cancel was requested, we'll throw here token.ThrowIfCancellationRequested(); // Do our processing now this.RunIteration(i); } } With this overload of PerformComputation, each internal iteration checks to see if a cancellation request was made, and will throw an OperationCanceledException at that point, instead of waiting until the method returns. This is good, since it allows us, as developers, to plan for cancellation, and terminate our routine in a clean, safe state. This is handled by changing our PLINQ query to: try { double min = collection .AsParallel() .WithCancellation(cts.Token) .Min(item => item.PerformComputation(cts.Token)); } catch (OperationCanceledException e) { // Query was cancelled before it finished } PLINQ is very good about handling this exception, as well. There is a very good chance that multiple items will raise this exception, since the entire purpose of PLINQ is to have multiple items be processed concurrently. PLINQ will take all of the OperationCanceledException instances raised within these methods, and merge them into a single OperationCanceledException in the call stack. This is done internally because we added the call to ParallelEnumerable.WithCancellation. If, however, a different exception is raised by any of the elements, the OperationCanceledException as well as the other Exception will be merged into a single AggregateException. The Task Parallel Library uses the same cancellation model, as well. Here, we supply our CancellationToken as part of the configuration. The ParallelOptions class contains a property for the CancellationToken. This allows us to cancel a Parallel.For or Parallel.ForEach routine in a very similar manner to our PLINQ query. As an example, we could rewrite our Parallel.ForEach loop from Part 2 to support cancellation by changing it to: try { var cts = new CancellationTokenSource(); var options = new ParallelOptions() { CancellationToken = cts.Token }; Parallel.ForEach(customers, options, customer => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // Check for cancellation here options.CancellationToken.ThrowIfCancellationRequested(); // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } }); } catch (OperationCanceledException e) { // The loop was cancelled } Notice that here we use the same approach taken in PLINQ. The Task Parallel Library will automatically handle our cancellation in the same manner as PLINQ, providing a clean, unified model for cancellation of any parallel routine. The TPL performs the same aggregation of the cancellation exceptions as PLINQ, as well, which is why a single exception handler for OperationCanceledException will cleanly handle this scenario. This works because we’re using the same CancellationToken provided in the ParallelOptions. If a different exception was thrown by one thread, or a CancellationToken from a different CancellationTokenSource was used to raise our exception, we would instead receive all of our individual exceptions merged into one AggregateException.

Read the article
Parallelism in .NET – Part 5, Partitioning of Work

- by Reed

When parallelizing any routine, we start by decomposing the problem. Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element. This process is called partitioning. Partitioning our tasks is a challenging feat. There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle. Trying to work the perfect balance between the two extremes is the goal for which we should aim. Luckily, the Task Parallel Library automatically handles much of this process. However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions. First off, I’d like to say that this is a more advanced topic. It is perfectly acceptable to use the parallel constructs in the framework without considering the partitioning taking place. The default behavior in the Task Parallel Library is very well-behaved, even for unusual work loads, and should rarely be adjusted. I have found few situations where the default partitioning behavior in the TPL is not as good or better than my own hand-written partitioning routines, and recommend using the defaults unless there is a strong, measured, and profiled reason to avoid using them. However, understanding partitioning, and how the TPL partitions your data, helps in understanding the proper usage of the TPL. I indirectly mentioned partitioning while discussing aggregation. Typically, our systems will have a limited number of Processing Elements (PE), which is the terminology used for hardware capable of processing a stream of instructions. For example, in a standard Intel i7 system, there are four processor cores, each of which has two potential hardware threads due to Hyperthreading. This gives us a total of 8 PEs – theoretically, we can have up to eight operations occurring concurrently within our system. In order to fully exploit this power, we need to partition our work into Tasks. A task is a simple set of instructions that can be run on a PE. Ideally, we want to have at least one task per PE in the system, since fewer tasks means that some of our processing power will be sitting idle. A naive implementation would be to just take our data, and partition it with one element in our collection being treated as one task. When we loop through our collection in parallel, using this approach, we’d just process one item at a time, then reuse that thread to process the next, etc. There’s a flaw in this approach, however. It will tend to be slower than necessary, often slower than processing the data serially. The problem is that there is overhead associated with each task. When we take a simple foreach loop body and implement it using the TPL, we add overhead. First, we change the body from a simple statement to a delegate, which must be invoked. In order to invoke the delegate on a separate thread, the delegate gets added to the ThreadPool’s current work queue, and the ThreadPool must pull this off the queue, assign it to a free thread, then execute it. If our collection had one million elements, the overhead of trying to spawn one million tasks would destroy our performance. The answer, here, is to partition our collection into groups, and have each group of elements treated as a single task. By adding a partitioning step, we can break our total work into small enough tasks to keep our processors busy, but large enough tasks to avoid overburdening the ThreadPool. There are two clear, opposing goals here: Always try to keep each processor working, but also try to keep the individual partitions as large as possible. When using Parallel.For, the partitioning is always handled automatically. At first, partitioning here seems simple. A naive implementation would merely split the total element count up by the number of PEs in the system, and assign a chunk of data to each processor. Many hand-written partitioning schemes work in this exactly manner. This perfectly balanced, static partitioning scheme works very well if the amount of work is constant for each element. However, this is rarely the case. Often, the length of time required to process an element grows as we progress through the collection, especially if we’re doing numerical computations. In this case, the first PEs will finish early, and sit idle waiting on the last chunks to finish. Sometimes, work can decrease as we progress, since previous computations may be used to speed up later computations. In this situation, the first chunks will be working far longer than the last chunks. In order to balance the workload, many implementations create many small chunks, and reuse threads. This adds overhead, but does provide better load balancing, which in turn improves performance. The Task Parallel Library handles this more elaborately. Chunks are determined at runtime, and start small. They grow slowly over time, getting larger and larger. This tends to lead to a near optimum load balancing, even in odd cases such as increasing or decreasing workloads. Parallel.ForEach is a bit more complicated, however. When working with a generic IEnumerable<T>, the number of items required for processing is not known in advance, and must be discovered at runtime. In addition, since we don’t have direct access to each element, the scheduler must enumerate the collection to process it. Since IEnumerable<T> is not thread safe, it must lock on elements as it enumerates, create temporary collections for each chunk to process, and schedule this out. By default, it uses a partitioning method similar to the one described above. We can see this directly by looking at the Visual Partitioning sample shipped by the Task Parallel Library team, and available as part of the Samples for Parallel Programming. When we run the sample, with four cores and the default, Load Balancing partitioning scheme, we see this: The colored bands represent each processing core. You can see that, when we started (at the top), we begin with very small bands of color. As the routine progresses through the Parallel.ForEach, the chunks get larger and larger (seen by larger and larger stripes). Most of the time, this is fantastic behavior, and most likely will out perform any custom written partitioning. However, if your routine is not scaling well, it may be due to a failure in the default partitioning to handle your specific case. With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner. There is the option to use an overload of Parallel.ForEach which takes a Partitioner<T> instance. The Partitioner<T> class is an abstract class which allows for both static and dynamic partitioning. By overriding Partitioner<T>.SupportsDynamicPartitions, you can specify whether a dynamic approach is available. If not, your custom Partitioner<T> subclass would override GetPartitions(int), which returns a list of IEnumerator<T> instances. These are then used by the Parallel class to split work up amongst processors. When dynamic partitioning is available, GetDynamicPartitions() is used, which returns an IEnumerable<T> for each partition. If you do decide to implement your own Partitioner<T>, keep in mind the goals and tradeoffs of different partitioning strategies, and design appropriately. The Samples for Parallel Programming project includes a ChunkPartitioner class in the ParallelExtensionsExtras project. This provides example code for implementing your own, custom allocation strategies, including a static allocator of a given chunk size. Although implementing your own Partitioner<T> is possible, as I mentioned above, this is rarely required or useful in practice. The default behavior of the TPL is very good, often better than any hand written partitioning strategy.

Read the article
Top Tweets SOA Partner Community – March 2012

- by JuergenKress

Send your tweets @soacommunity #soacommunity and follow us at http://twitter.com/soacommunity SOA Community ?SOA Community Newsletter February 2012 wp.me/p10C8u-o0 Marc ?Reading through the #OFM 11.1.1.6 , patchset 5 documentation. What is the best way to upgrade your whole dev…prd street. SOA Community Thanks for the successful and super interesting #sbidays ! Wonderful discussions around the Integration, case management and security tracks Torsten Winterberg Schon den neuen Opitz Technology-Blog gebookmarked? The Cattle Crew bit.ly/yLPwBD wird ab sofort regelmäßig Erkenntnisse posten. OTNArchBeat ? Unit Testing Asynchronous BPEL Processes Using soapUI | @DanielAmadei bit.ly/x9NsS9 Rolando Carrasco ?Video de Human Task en BPM 11g. Por @edwardo040. bit.ly/wki9CA cc @OracleBPM @OracleSOA @soacommunity View video Marcel Mertin SOA Security Hands-On by Dirk Krafzig and Mamoon Yunus at #sbidays is also great! SOA Community Workshop day #sbidays #BPMN2.0 by Volker Stiehl from #SAP great training – now I can model & execute in #bpmsuite #soacommunity Simone Geib ?Just updated our advanced #soasuite #otn page with a number of very interesting @orclateamsoa blog posts: bit.ly/advancedsoasui… OTNArchBeat ? Start Small, Grow Fast: SOA Best Practices article by @biemond, @rluttikhuizen, @demed bit.ly/yem9Zv Steffen Miller ? Nice new features in SOA Suite Business Rules #PS5 Testing rules with scenarios and output validation bit.ly/zj64Q3 @SOACOMMUNITY OTNArchBeat ? Reply SOA Blackbelt training by David Shaffer, April 30th–May 4th 2012 bit.ly/xGdC24 OTNArchBeat ? What have BPM, big data, social tools, and business models got in common? | Andy Mulholland bit.ly/xUkOGf SOA Community ? Live hacking at #sbidays – cheaper shopping, bias cracking, payment systems, secure your SOA! pic.twitter.com/y7YaIdug SOA Community Future #BPM & #ACM solutions can make use of ontology’s, based on #sqarql #sbidays pic.twitter.com/xLb1Z5zs Simone Geib ? @soacommunity: SOA Blackbelt training by David Shaffer, April 30th–May 4th 2012 wp.me/p10C8u-nX Biemond Changing your ADF Connections in Enterprise Manager with PS5: With Patch Set 5 of Fusion Middleware you can fina… bit.ly/zF7Rb1 Marc ? HUGE (!) CPU and Heap improvement on Oracle Fusion Middleware tinyurl.com/762drzp @wlscommunity @soacommunity #OSB #SOA #WLS SOA Community ?Networking @ SOA & BPM Partner Community blogs.oracle.com/soacommunity/e… #soacommunity #otn #opn #oracle SOA Community ?Published the SOA Partner Community newsletter February edition – READ it. Not yet a member? oracle.com/goto/emea/soa #soacommunity #otn #opn AMIS, Oracle & Java Blog by Lucas Jellema: "Book Review: Do More with SOA Integration: Best of Packt (december 2011, various authors)" bit.ly/wq633E Jon petter hjulstad @SOASimone Excellent summary! Lots of new features! Simone Geib ?Do you want to know what’s new in #soasuite #PS5? Go to bit.ly/xBX06f and let me know what you think SOA Community ? Unit Testing Asynchronous BPEL Processes Using soapUI oracle.com/technetwork/ar… #soacommunity #soa #otn #oracle #bpel Retweeted by SOA Community View media Retweeted by SOA Community Eric Elzinga ? Oracle Fusion Middleware Partner Community Forum Malage, The Overview, bit.ly/AA9BKd #ofmforum SOA&Cloud Symposium ? The February issue of the Service Technology Magazine is now published. servicetechmag.com SOA Community ? Oracle SOA Suite 11g Database Growth Management – must read! oracle.com/technetwork/da… #soacommunity #soa #purging demed ? Have you exposed internal processes to mobile devices using #oraclesoa? Interested in an article? DM me! #osb #rest #multichannel #mobile orclateamsoa ? A-Team SOA Blog: Enhanced version of Thread Dump Analyzer (TDA A-Team) ow.ly/1hpk7l SOA Community Reply BPM Suite #PS5 (11.1.1.6) available for download soacommunity.wordpress.com/2012/02/22/soa… Send us your feedback! #soacommunity #bpmsuite #opn SOA Community ? SOA Suite #PS5 (11.1.1.6) available for download soacommunity.wordpress.com/2012/02/22/soa… Send us your feedback! #soacommunity #soasuite SOA Community BPM Suite #PS5 1(1.1.1.6) available for download. List of new BPM features blogs.oracle.com/soacommunity/e… #soacommunity #bpm #bpmsuite #opn OracleBlogs BPM in Utilties Industry ow.ly/1hC3fp Retweeted by SOA Community OTNArchBeat ? Demystifying Oracle Enterprise Gateway | Naresh Persaud bit.ly/xtDNe2 OTNArchBeat ? Architect’s Guide to Big Data; Test BPEL Processes Using SoapUI; Development Debate bit.ly/xbDYSo Frank Nimphius ? Finished my book review of "Do More with SOA Integration: Best of Packt ". Here are my review comments: bit.ly/x2k9OZ Lucas Jellema ? That is my one stop-and-go download center for #PS5 : edelivery.oracle.com/EPD/Download/g… Lucas Jellema ? Interesting piece of documentation: Fusion Applications Extensibility Guide – docs.oracle.com/cd/E15586_01/f… source for design time @ run time inspira Lucas Jellema ? Strongly improved support for testing Business Rules at Design Time in #PS5 see docs.oracle.com/cd/E23943_01/u… Lucas Jellema ? SOA Suite 11gR1 PS5: new BPEL Component testing – docs.oracle.com/cd/E23943_01/d… Lucas Jellema ? PS5 available for CEP (Complex Event Processing) – a personal favorite of mine : oracle.com/technetwork/mi… Lucas Jellema ?What’s New in Fusion Developer’s Guide 11gR1 PS5: docs.oracle.com/cd/E23943_01/w… Lucas Jellema ? BPMN Correlation (FMW 11gR1 PS5): docs.oracle.com/cd/E23943_01/d… Lucas Jellema ? Modifying running BPM Process instances (FMW 11gR1 PS5): docs.oracle.com/cd/E23943_01/d… Lucas Jellema ? SOA Suite 11gR1 PS5 – new aggregation pattern: docs.oracle.com/cd/E23943_01/d… routing multiple messages to same instance Melvin van der Kuijl ? Automating Testing of SOA Composite Applications in PS5. docs.oracle.com/cd/E23943_01/d… Cato Aune ? SOA suite PS5 Enterprise Deployment Guide is available in ePub docs.oracle.com/cd/E23943_01/c… . Much better than pdf on Galaxy Note SOA Community ?JDeveloper 11.1.1.6 is available for download bit.ly/wGYrwE #soacommunity SOA Community ? Your first experience #PS5 – let us know @soacommunity – send us your tweets and blog posts! #soacommunity Jon petter hjulstad ? WLS 10.3.6 New features, ex better logging of jdbc use: docs.oracle.com/cd/E23943_01/w… Heidi Buelow ? Get it now! RT @soacommunity: BPM Suite PS5 11.1.1.6 available for download bit.ly/AgagT5 #bpm #soacommunity Jon petter hjulstad ?SOA Suite PS5 EDG contains OSB! docs.oracle.com/cd/E23943_01/c… Jon petter hjulstad ? Testing Oracle Rules from JDeveloper is easier in PS5: docs.oracle.com/cd/E23943_01/u… Biemond® ? What’s New in Oracle Service Bus 11.1.1.6.0 oracle.com/technetwork/mi… Jon petter hjulstad ? Adminguide New and Changed Features for PS5, ex GridLink data sources: docs.oracle.com/cd/E23943_01/c… Retweeted by SOA Community Andreas Koop ? Unbelievable! #OFM Doc Lib growth from 11gPS4->11gPS5 by 1.2G! View media SOA Community ?ODI PS5 is available oracle.com/technetwork/mi… #odi #soacommunity 22 Feb View media SOA Community Service Bus 11g Development Cookbook soacommunity.wordpress.com/2012/02/20/ser… #osb #soacommunity #ace #opn View media For regular information on Oracle SOA Suite become a member in the SOA Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) Blog Twitter LinkedIn Mix Forum Technorati Tags: soacommunity,twitter,Oracle,SOA Community,Jürgen Kress,OPN,SOA,BPM

Read the article
Oracle Data Mining a Star Schema: Telco Churn Case Study

- by charlie.berger

There is a complete and detailed Telco Churn case study "How to" Blog Series just posted by Ari Mozes, ODM Dev. Manager. In it, Ari provides detailed guidance in how to leverage various strengths of Oracle Data Mining including the ability to: mine Star Schemas and join tables and views together to obtain a complete 360 degree view of a customer combine transactional data e.g. call record detail (CDR) data, etc. define complex data transformation, model build and model deploy analytical methodologies inside the Database His blog is posted in a multi-part series. Below are some opening excerpts for the first 3 blog entries. This is an excellent resource for any novice to skilled data miner who wants to gain competitive advantage by mining their data inside the Oracle Database. Many thanks Ari! Mining a Star Schema: Telco Churn Case Study (1 of 3) One of the strengths of Oracle Data Mining is the ability to mine star schemas with minimal effort. Star schemas are commonly used in relational databases, and they often contain rich data with interesting patterns. While dimension tables may contain interesting demographics, fact tables will often contain user behavior, such as phone usage or purchase patterns. Both of these aspects - demographics and usage patterns - can provide insight into behavior.Churn is a critical problem in the telecommunications industry, and companies go to great lengths to reduce the churn of their customer base. One case study1 describes a telecommunications scenario involving understanding, and identification of, churn, where the underlying data is present in a star schema. That case study is a good example for demonstrating just how natural it is for Oracle Data Mining to analyze a star schema, so it will be used as the basis for this series of posts...... Mining a Star Schema: Telco Churn Case Study (2 of 3) This post will follow the transformation steps as described in the case study, but will use Oracle SQL as the means for preparing data. Please see the previous post for background material, including links to the case study and to scripts that can be used to replicate the stages in these posts.1) Handling missing values for call data recordsThe CDR_T table records the number of phone minutes used by a customer per month and per call type (tariff). For example, the table may contain one record corresponding to the number of peak (call type) minutes in January for a specific customer, and another record associated with international calls in March for the same customer. This table is likely to be fairly dense (most type-month combinations for a given customer will be present) due to the coarse level of aggregation, but there may be some missing values. Missing entries may occur for a number of reasons: the customer made no calls of a particular type in a particular month, the customer switched providers during the timeframe, or perhaps there is a data entry problem. In the first situation, the correct interpretation of a missing entry would be to assume that the number of minutes for the type-month combination is zero. In the other situations, it is not appropriate to assume zero, but rather derive some representative value to replace the missing entries. The referenced case study takes the latter approach. The data is segmented by customer and call type, and within a given customer-call type combination, an average number of minutes is computed and used as a replacement value.In SQL, we need to generate additional rows for the missing entries and populate those rows with appropriate values. To generate the missing rows, Oracle's partition outer join feature is a perfect fit. select cust_id, cdre.tariff, cdre.month, minsfrom cdr_t cdr partition by (cust_id) right outer join (select distinct tariff, month from cdr_t) cdre on (cdr.month = cdre.month and cdr.tariff = cdre.tariff); ....... Mining a Star Schema: Telco Churn Case Study (3 of 3) Now that the "difficult" work is complete - preparing the data - we can move to building a predictive model to help identify and understand churn.The case study suggests that separate models be built for different customer segments (high, medium, low, and very low value customer groups). To reduce the data to a single segment, a filter can be applied: create or replace view churn_data_high asselect * from churn_prep where value_band = 'HIGH'; It is simple to take a quick look at the predictive aspects of the data on a univariate basis. While this does not capture the more complex multi-variate effects as would occur with the full-blown data mining algorithms, it can give a quick feel as to the predictive aspects of the data as well as validate the data preparation steps. Oracle Data Mining includes a predictive analytics package which enables quick analysis. begin dbms_predictive_analytics.explain( 'churn_data_high','churn_m6','expl_churn_tab'); end; /select * from expl_churn_tab where rank <= 5 order by rank; ATTRIBUTE_NAME ATTRIBUTE_SUBNAME EXPLANATORY_VALUE RANK-------------------- ----------------- ----------------- ----------LOS_BAND .069167052 1MINS_PER_TARIFF_MON PEAK-5 .034881648 2REV_PER_MON REV-5 .034527798 3DROPPED_CALLS .028110322 4MINS_PER_TARIFF_MON PEAK-4 .024698149 5From the above results, it is clear that some predictors do contain information to help identify churn (explanatory value > 0). The strongest uni-variate predictor of churn appears to be the customer's (binned) length of service. The second strongest churn indicator appears to be the number of peak minutes used in the most recent month. The subname column contains the interior piece of the DM_NESTED_NUMERICALS column described in the previous post. By using the object relational approach, many related predictors are included within a single top-level column. ..... NOTE: These are just EXCERPTS. Click here to start reading the Oracle Data Mining a Star Schema: Telco Churn Case Study from the beginning.

Read the article
B2B and B2C Commerce are alike… but a little different – Oracle Commerce named Leader in Forrester B2B Commerce Wave

- by Katrina Gosek

We weren’t surprised to see Oracle Commerce positioned as a Leader in Forrester’s first Commerce Wave focused on B2B, released earlier this month. The reports validates much of what we’ve heard from our largest customers – the world’s largest distribution, manufacturing and high-tech customers who sell billions of dollars of goods and services to other businesses through their Web channels. More importantly, the report confirms something very important: B2B and B2C Commerce are alike… but a little different. B2B and B2C Commerce are alike… Clearly, B2C experiences have set expectations for B2B. Every B2B buyer is a consumer at home and brings the same expectations to a website selling electronic components, aftermarket parts, or MRO products. Forrester calls these rich consumer-based capabilities that help B2B customers do their jobs “table stakes”: search & navigation, promotions, cross-channel commerce and mobile: “Whether they are just beginning to sell online or are in the late stages of launching a next-generation site, B2B eCommerce operations today must: offer a customer experience standard comparable to what leading b2c sites now offer; address the growing influence that mobile devices are having in the workplace; make a qualitative and quantitative business case that drives sustained investment.” Just five years ago, many of our B2B customers’ online business comprised only 5-10% of their total revenue. Today, when we speak to those same brands, we hear about double and triple digit growth in their online channels. Many have seen the percentage of the business they perform in their web channels cross the 30-50% threshold. You can hear first-hand from several Oracle Commerce B2B customers about the success they are seeing, and what they’re trying to accomplish (Carolina Biological, Premier Farnell, DeliXL, Elsevier). This momentum is likely the reason Forrester broke out the separate B2B Commerce Wave from the B2C Wave. In fact, B2B is becoming the larger force in commerce, expected to collect twice the online dollars of B2C this year ($559 billion). But a little different… Despite the similarities, there is a key and very important difference between B2C and B2B. Unlike a consumer shopping for shoes, a business shopper buying from a distributor or manufacturer is coming to the Web channel as a part of their job. So in addition to a rich, consumer-like experience this shopper expects, these B2B buyers need quoting tools and complex pricing capabilities, like eProcurement, bulk order entry, and other self-service tools such as account, contract and organization management. Forrester also is emphasizing three additional “back-end” tools and capabilities their clients say they need to drive growth in their B2B online channels: i) product information management (PIM), which provides a single system of record for large part lists and product catalogs; ii) web content management (WCM), needed to manage large volumes of unstructured marketing information, and iii) order management systems (OMS), which manage and orchestrate the complex B2B order life cycle from quote through approval, submission to manufacturing, distribution and delivery. We would like to expand on each of these 3 areas: As Forrester highlights, back-end PIM is definitely needed by B2B Commerce providers. Most B2B companies have made significant investments in enterprise-grade PIMs, given the importance of product data management for aggregation and syndication of content, product attribution, analytics, and handling of complex workflows. While in principle it may sound appealing to have a PIM as part of a commerce offering (especially for SMBs who have to do more with less), our customers have typically found that PIM in a commerce platform is largely redundant with what they already have in-place, and is not fully-featured or robust enough to handle the complexity of the product data sets that B2B distributors and manufacturers usually handle. To meet the PIM needs for commerce, Oracle offers enterprise PIM (Product Hub/Fusion PIM) and a robust enterprise data quality product (EDQP) integrated with the Oracle Commerce solution. These are key differentiators of our offering and these capabilities are becoming even more tightly integrated with Oracle Commerce over time. For Commerce, what customers really need is a robust product catalog and content management system for enabling business users to further enrich and ready catalog and content data to be presented and sold online. This has been a significant area of investment in the Oracle Commerce platform , which continue to get stronger. We see this combination of capabilities as best meeting the needs of our customers for a commerce platform without adding a largely redundant, less functional PIM in the commerce front-end. On the topic of web content management, we were pleased to see Forrester recognize Oracle’s unique functional capabilities in this area and the “unique opportunity in the market to lead the convergence of commerce and content management with the amalgamation of Oracle Commerce with WebCenter Sites (formally FatWire).” Strong content management capabilities are critical for distributors and manufacturers who are frequently serving an engineering audience coming to their websites to conduct product research in search of technical data sheets, drawings, videos and more. The convergence of content, commerce, and experience is critical for B2B brands selling online. Regarding order management, Forrester notes that many businesses use their existing back-end enterprise resource planning (ERP) systems to manage order life cycles. We hear the same from most of our B2B customers, as they already have an ERP system—if not several of them—and are not interested in yet another one. So what do we take away from the Wave results? Forrester notes that the Oracle Commerce Platform “has always had strong B2B commerce capabilities and Oracle has an exhaustive list of B2B customers using the solution.” What makes us excited about developing leading B2B solutions are the close relationships with our customers and the clear opportunity in the market – which we’ll address in an exciting new release in the coming months. Oracle has one of the world’s largest B2B customer bases, providing leading solutions across key business-to-business functions – from marketing, sales automation, and service to master data management, and ERP. To learn more about Oracle’s Commerce product vision and strategy, visit our website and check out these other B2B Commerce Resources: - 2013 B2B Commerce Trends Report - B2B Commerce Whitepaper: Consumerization, Complexity, Change - B2B Commerce Webcast: What Industry Trend Setters Do Right - Internet Retailer, Web Drives Sales for B2B Companies - Internet Retailer, The Web Means Business: B2B Companies Beef Up Their Websites, borrowing from b2c retailers and breaking new ground - Internet Retailer, B2B e-Commerce is poised for growth ----------THIS DOCUMENT IS FOR INFORMATIONAL PURPOSES ONLY AND MAY NOT BE INCORPORATED INTO A CONTRACT OR AGREEMENT

Read the article
202 blog articles

- by mprove

All my blog articles under blogs.oracle.com since August 2005: 202 blog articles Apr 2012 blogs.oracle.com design patch Mar 2012 Interaction 12 - Critique Mar 2012 Typing. Clicking. Dancing. Feb 2012 Desktop Mobility in Hospitals with Oracle VDI /video Feb 2012 Interaction 12 in Dublin - Highlights of Day 3 Feb 2012 Interaction 12 in Dublin - Highlights of Day 2 Feb 2012 Interaction 12 in Dublin - Highlights of Day 1 Feb 2012 Shit Interaction Designers Say Feb 2012 Tips'n'Tricks for WebCenter #3: How to display custom page titles in Spaces Jan 2012 Tips'n'Tricks for WebCenter #2: How to create an Admin menu in Spaces and save a lot of time Jan 2012 Tips'n'Tricks for WebCenter #1: How to apply custom resources in Spaces Jan 2012 Merry XMas and a Happy 2012! Dec 2011 One Year Oracle SocialChat - The Movie Nov 2011 Frank Ludolph's Last Working Day Nov 2011 Hans Rosling at TED Oct 2011 200 Countries x 200 Years Oct 2011 Blog Aggregation for Desktop Virtualization Oct 2011 Oracle VDI at OOW 2011 Sep 2011 Design for Conversations & Conversations for Design Sep 2011 All Oracle UX Blogs Aug 2011 Farewell Loriot Aug 2011 Oracle VDI 3.3 Overview Aug 2011 Sutherland's Closing Remarks at HyperKult Aug 2011 Surface and Subface Aug 2011 Back to Childhood in UI Design Jul 2011 The Art of Engineering and The Engineering of Art Jul 2011 Oracle VDI Seminar - June-30 Jun 2011 SGD White Paper May 2011 TEDxHamburg Live Feed May 2011 Oracle VDI in 3 Minutes May 2011 Space Ship Earth 2011 May 2011 blog moving times Apr 2011 Frozen tag cloud Apr 2011 Oracle: Hardware Software Complete in 1953 Apr 2011 Interaction Design with Wireframes Apr 2011 A guide to closing down a project Feb 2011 Oracle VDI 3.2.2 Jan 2011 free VDI charts Jan 2011 Sun Founders Panel 2006 Dec 2010 Sutherland on Leadership Dec 2010 SocialChat: Efficiency of E20 Dec 2010 ALWAYS ON Desktop Virtualization Nov 2010 12,000 Desktops at JavaOne Nov 2010 SocialChat on Sharing Best Practices Oct 2010 Globe of Visitors Oct 2010 SocialChat about the Next Big Thing Oct 2010 Oracle VDI UX Story - Wireframes Oct 2010 What's a PC anyway? Oct 2010 SocialChat on Getting Things Done Oct 2010 SocialChat on Infoglut Oct 2010 IT Twenty Twenty Oct 2010 Desktop Virtualization Webcasts from OOW Oct 2010 Oracle VDI 3.2 Overview Sep 2010 Blog Usability Top 7 Sep 2010 100 and counting Aug 2010 Oracle'izing the VDI Blogs Aug 2010 SocialChat on Apple Aug 2010 SocialChat on Video Conferencing Aug 2010 Oracle VDI 3.2 - Features and Screenshots Aug 2010 SocialChat: Don't stop making waves Aug 2010 SocialChat: Giving Back to the Community Aug 2010 SocialChat on Learning in Meetings Aug 2010 iPAD's Natural User Interface Jul 2010 Last day for Sun Microsystems GmbH Jun 2010 SirValUse Celebration Snippets Jun 2010 10 years SirValUse - Happy Birthday! Jun 2010 Wim on Virtualization May 2010 New Home for Oracle VDI Apr 2010 Renaissance Slide Sorter Comments Apr 2010 Unboxing Sun Ray 3 Plus Apr 2010 Desktop Virtualisierung mit Sun VDI 3.1 Apr 2010 Blog Relaunch Mar 2010 Social Messaging Slides from CeBIT Mar 2010 Social Messaging Talk at CeBIT Feb 2010 Welcome Oracle Jan 2010 My last presentation at Sun Jan 2010 Ivan Sutherland on Leadership Jan 2010 Learning French with Sun VDI Jan 2010 Learning Danish with Sun Ray Jan 2010 VDI workshop in Nieuwegein Jan 2010 Happy New Year 2010 Jan 2010 On Creating Slides Dec 2009 Best VDI Ever Nov 2009 How to store the Big Bang Nov 2009 Social Enterprise Tools. Beipiel Sun. Nov 2009 Nov-19 Nov 2009 PDF and ODF links on your blog Nov 2009 Q&A on VDI and MySQL Cluster Nov 2009 Zürich next week: Swiss Intranet Summit 09 Nov 2009 Designing for a Sustainable World - World Usabiltiy Day, Nov-12 Nov 2009 How to export a desktop from VDI 3 Nov 2009 Virtualisation Roadshow in the UK Nov 2009 Project Wonderland at EDUCAUSE 09 Nov 2009 VDI Roadshow in Dublin, Nov-26, 2009 Nov 2009 Sun VDI at EDUCAUSE 09 Nov 2009 Sun VDI 3.1 Architecture and New Features Oct 2009 VDI 3.1 is Early-Access Sep 2009 Virtualization for MySQL on VMware Sep 2009 Silpion & 13. Stock Sommerparty Sep 2009 Sun Ray and VMware View 3.1.1 2009-08-31 New Set of Sun Ray Status Icons 2009-08-25 Virtualizing the VDI Core? 2009-08-23 World Usability Day Hamburg 2009 - CfP 2009-07-16 Rising Sun 2009-07-15 featuring twittermeme 2009-06-19 ISC09 Student Party on June-20 /Hamburg 2009-06-18 Before and behind the curtain of JavaOne 2009-06-09 20k desktops at JavaOne 2009-06-01 sweet microblogging 2009-05-25 VDI 3 - Why you need 3 VDI hosts and what you can do about that? 2009-05-21 IA Konferenz 2009 2009-05-20 Sun VDI 3 UX Story - Power of the Web 2009-05-06 Planet of Sun and Oracle User Experience Design 2009-04-22 Sun VDI 3 UX Story - User Research 2009-04-08 Sun VDI 3 UX Story - Concept Workshops 2009-04-06 Localized documentation for Sun Ray Connector for VMware View Manager 1.1 2009-04-03 Sun VDI 3 Press Release 2009-03-25 Sun VDI 3 launches today! 2009-03-25 Sun Ray Connector for VMware View Manager 1.1 Update 2009-03-11 desktop virtualization wiki relaunch 2009-03-06 VDI 3 at CeBIT hall 6, booth E36 2009-03-02 Keyboard layout problems with Sun Ray Connector for VMware VDM 2009-02-23 wikis.sun.com tips & tricks 2009-02-23 Sun VDI 3 is in Early Access 2009-02-09 VirtualCenter unable to decrypt passwords 2009-02-02 Sun & VMware Desktop Training 2009-01-30 VDI at next09? 2009-01-16 Sun VDI: How to use virtual machines with multiple network adapters 2009-01-07 Sun Ray and VMware View 2009-01-07 Hamburg World Usability Day 2008 - Webcasts 2009-01-06 Sun Ray Connector for VMware VDM slides 2008-12-15 mother of all demos 2008-12-08 Build your own Thumper 2008-12-03 Troubleshooting Sun Ray Connector for VMware VDM 2008-12-02 My Roller Tag Cloud 2008-11-28 Sun Ray Connector: SSL connection to VDM 2008-11-25 Setting up SSL and Sun Ray Connector for VMware VDM 2008-11-13 Inspiration for Today and Tomorrow 2008-10-23 Sun Ray Connector for VMware VDM released 2008-10-14 From Sketchpad to ILoveSketch 2008-10-09 Desktop Virtualization on Xing 2008-10-06 User Experience Forum on Xing 2008-10-06 Sun Ray Connector for VMware VDM certified 2008-09-17 Virtual Clouds over Las Vegas 2008-09-14 Bill Verplank sketches metaphors 2008-09-04 End of Early Access - Sun Ray Connector for VMware 2008-08-27 Early Access: Sun Ray Connector for VMware Virtual Desktop Manager 2008-08-12 Sun Virtual Desktop Connector - Insides on Recycling Part 2 2008-07-20 Sun Virtual Desktop Connector - Insides on Recycling Part 3 2008-07-20 Sun Virtual Desktop Connector - Insides on Recycling 2008-07-20 lost in wiki space 2008-07-07 Evolution of the Desktop 2008-06-17 Virtual Desktop Webcast 2008-06-16 Woodstock 2008-06-16 What's a Desktop PC anyway? 2008-06-09 Virtual-T-Box 2008-06-05 Virtualization Glossary 2008-05-06 Five User Experience Principles 2008-04-25 Virtualization News Feed 2008-04-21 Acetylcholinesterase - Second Season 2008-04-18 Acetylcholinesterase - End of Signal 2007-12-31 Produkt-Management ist... 2007-10-22 Usability Verbände, Verteiler und Netzwerke. 2007-10-02 The Meaning is the Message 2007-09-28 Visualization Methods 2007-09-10 Inhouse und Open Source Projekte – Usability verankern und Synergien nutzen 2007-09-03 Der Schwabe Darth Vader entdeckt das Virale Marketing 2007-08-29 Dick Hardt 3.0 on Identity 2.0 2007-08-27 quality of written text depends on the tool 2007-07-27 podcasts for reboot9 2007-06-04 It is the user's itch that need to be scratched 2007-05-25 A duel at reboot9 2007-05-14 Taxonomien und Folksonomien - Tagging als neues HCI-Element 2007-05-10 Dueling Interaction Models of Personal-Computing and Web-Computing 2007-03-01 22.März: Weizenbaum. Rebel at Work. /Filmpremiere Hamburg 2007-02-25 Bruce Sterling at UbiComp 2006 /webcast 2006-11-12 FSOSS 2006 /webcasts 2006-11-10 Highway 101 2006-11-09 User Experience Roundtable Hamburg: EuroGEL 2006 2006-11-08 Douglas Adams' Hyperland (BBC 1990) 2006-10-08 Taxonomien und Folksonomien – Tagging als neues HCI-Element 2006-09-13 Usability im Unternehmen 2006-09-13 Doug does HyperScope 2006-08-26 TED Talks and TechTalks 2006-08-21 Kai Krause über seine Freundschaft zu Douglas Adams 2006-07-20 Rebel At Work: Film Portrait on Weizenbaum 2006-07-04 Gabriele Fischer, mp3 2006-06-07 Dick Hardt at ETech 06 2006-06-05 Weinberger: From Control to Conversation 2006-04-16 Eye Tracking at User Experience Roundtable Hamburg 2006-04-14 dropping knowledge 2006-04-09 GEL 2005 2006-03-13 slide photos of reboot7 2006-03-04 Dick Hardt on Identity 2.0 2006-02-28 User Experience Newsletter #13: Versioning 2006-02-03 Ester Dyson on Choice and Happyness 2006-02-02 Requirements-Engineering im Spannungsfeld von Individual- und Produktsoftware 2006-01-15 User Experience Newsletter #12: Intuition Quiz 2005-11-30 User Experience und Requirements-Engineering für Software-Projekte 2005-10-31 Ivan Sutherland on "Research and Fun" 2005-10-18 Ars Electronica / Mensch und Computer 2005 2005-09-14 60 Jahre nach Memex: Über die Unvereinbarkeit von Desktop- und Web-Paradigma 2005-08-31 reboot 7 2005-06-30

Read the article
B2B and B2C alike… but a little different – Oracle Commerce named Leader in Forrester B2B Commerce Wave

- by Katrina Gosek

We weren’t surprised to see Oracle Commerce positioned as a Leader in Forrester Research, Inc.’s first Commerce Wave focused on B2B, “The Forrester Wave™: B2B Commerce Suites, Q4 2013,” released earlier this month. We believe that the report validates much of what we’ve heard from our largest customers – the world’s largest distribution, manufacturing and high-tech customers who sell billions of dollars of goods and services to other businesses through their Web channels. More importantly, we feel that the report confirms something very important: B2B and B2C Commerce are alike… but a little different. B2B and B2C Commerce are alike… Clearly, B2C experiences have set expectations for B2B. Every B2B buyer is a consumer at home and brings the same expectations to a website selling electronic components, aftermarket parts, or MRO products. Forrester calls these rich consumer-based capabilities that help B2B customers do their jobs “table stakes”: front-office content, community, and commerce features that meet customer expectations for 24x7x365 ordering, real-time customer service, and expedited shipping — both online and on mobile devices: “Whether they are just beginning to sell online or are in the late stages of launching a next-generation site, B2B eCommerce operations today must: offer a customer experience standard comparable to what leading b2c sites now offer; address the growing influence that mobile devices are having in the workplace; make a qualitative and quantitative business case that drives sustained investment.” Just five years ago, many of our B2B customers’ online business comprised only 5-10% of their total revenue. Today, when we speak to those same brands, we hear about double and triple digit growth in their online channels. Many have seen the percentage of the business they perform in their web channels cross the 30-50% threshold. You can hear first-hand from several Oracle Commerce B2B customers about the success they are seeing, and what they’re trying to accomplish (Carolina Biological, Premier Farnell, DeliXL, Elsevier). It seems that this market momentum is likely the reason Forrester broke out the separate B2B Commerce Wave from the B2C Wave. In fact, B2B is becoming the larger force in commerce, expected to collect twice the online dollars of B2C this year ($559 billion). But a little different… Despite the similarities, there is a key and very important difference between B2C and B2B. Unlike a consumer shopping for shoes, a business shopper buying from a distributor or manufacturer is coming to the Web channel as a part of their job. So in addition to a rich, consumer-like experience this shopper expects, these B2B buyers need quoting tools and complex pricing capabilities, like eProcurement, bulk order entry, and other self-service tools such as account, contract and organization management. Forrester also is emphasizing three additional “back-end” tools and capabilities their clients say they need to drive growth in their B2B online channels: i) product information management (PIM), which provides a single system of record for large part lists and product catalogs; ii) web content management (WCM), needed to manage large volumes of unstructured marketing information, and iii) order management systems (OMS), which manage and orchestrate the complex B2B order life cycle from quote through approval, submission to manufacturing, distribution and delivery. We would like to expand on each of these 3 areas: As Forrester suggests, back-end PIM is definitely needed by B2B Commerce providers. Most B2B companies have made significant investments in enterprise-grade PIMs, given the importance of product data management for aggregation and syndication of content, product attribution, analytics, and handling of complex workflows. While in principle it may sound appealing to have a PIM as part of a commerce offering (especially for SMBs who have to do more with less), our customers have typically found that PIM in a commerce platform is largely redundant with what they already have in-place, and is not fully-featured or robust enough to handle the complexity of the product data sets that B2B distributors and manufacturers usually handle. To meet the PIM needs for commerce, Oracle offers enterprise PIM (Product Hub/Fusion PIM) and a robust enterprise data quality product (EDQP) integrated with the Oracle Commerce solution. These are key differentiators of our offering and these capabilities are becoming even more tightly integrated with Oracle Commerce over time. For Commerce, what customers really need is a robust product catalog and content management system for enabling business users to further enrich and ready catalog and content data to be presented and sold online. This has been a significant area of investment in the Oracle Commerce platform , which continue to get stronger. We see this combination of capabilities as best meeting the needs of our customers for a commerce platform without adding a largely redundant, less functional PIM in the commerce front-end. On the topic of web content management, we were pleased to see Forrester cite Oracle’s differentiated digital experience capability in this area and the “unique opportunity in the market to lead the convergence of commerce and content management with the amalgamation of Oracle Commerce with WebCenter Sites (formally FatWire).” Strong content management capabilities are critical for distributors and manufacturers who are frequently serving an engineering audience coming to their websites to conduct product research in search of technical data sheets, drawings, videos and more. The convergence of content, commerce, and experience is critical for B2B brands selling online. Regarding order management, Forrester notes that many businesses use their existing back-end enterprise resource planning (ERP) systems to manage order life cycles. We hear the same from most of our B2B customers, as they already have an ERP system—if not several of them—and are not interested in yet another one. So what do we take away from the Wave results? Forrester notes that the Oracle Commerce Platform “has always had strong B2B commerce capabilities and Oracle certainly has an exhaustive list of B2B customers using the solution.” What makes us excited about developing leading B2B solutions are the close relationships with our customers and the clear opportunity in the market – which we'll address in an exciting new release planned for the next 12 months. Oracle has one of the world’s largest B2B customer bases, providing leading solutions across key business-to-business functions – from marketing, sales automation, and service to master data management, and ERP. To learn more about Oracle’s Commerce product vision and strategy, visit our website and check out these other B2B Commerce Resources: - 2013 B2B Commerce Trends Report - B2B Commerce Whitepaper: Consumerization, Complexity, Change - B2B Commerce Webcast: What Industry Trend Setters Do Right - Internet Retailer, Web Drives Sales for B2B Companies - Internet Retailer Article, The Web Means Business: B2B Companies Beef Up Their Websites, borrowing from b2c retailers and breaking new ground - Internet Retailer Article, B2B e-Commerce is poised for growth

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
SQL SERVER – Weekly Series – Memory Lane – #032

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Complete Series of Database Coding Standards and Guidelines SQL SERVER Database Coding Standards and Guidelines – Introduction SQL SERVER – Database Coding Standards and Guidelines – Part 1 SQL SERVER – Database Coding Standards and Guidelines – Part 2 SQL SERVER Database Coding Standards and Guidelines Complete List Download Explanation and Example – SELF JOIN When all of the data you require is contained within a single table, but data needed to extract is related to each other in the table itself. Examples of this type of data relate to Employee information, where the table may have both an Employee’s ID number for each record and also a field that displays the ID number of an Employee’s supervisor or manager. To retrieve the data tables are required to relate/join to itself. Insert Multiple Records Using One Insert Statement – Use of UNION ALL This is very interesting question I have received from new developer. How can I insert multiple values in table using only one insert? Now this is interesting question. When there are multiple records are to be inserted in the table following is the common way using T-SQL. Function to Display Current Week Date and Day – Weekly Calendar Straight blog post with script to find current week date and day based on the parameters passed in the function. 2008 In my beginning years, I have almost same confusion as many of the developer had in their earlier years. Here are two of the interesting question which I have attempted to answer in my early year. Even if you are experienced developer may be you will still like to read following two questions: Order Of Column In Index Order of Conditions in WHERE Clauses Example of DISTINCT in Aggregate Functions Have you ever used DISTINCT with the Aggregation Function? Here is a simple example about how users can do it. Create a Comma Delimited List Using SELECT Clause From Table Column Straight to script example where I explained how to do something easy and quickly. Compound Assignment Operators SQL SERVER 2008 has introduced new concept of Compound Assignment Operators. Compound Assignment Operators are available in many other programming languages for quite some time. Compound Assignment Operators is operator where variables are operated upon and assigned on the same line. PIVOT and UNPIVOT Table Examples Here is a very interesting question – the answer to the question can be YES or NO both. “If we PIVOT any table and UNPIVOT that table do we get our original table?” Read the blog post to get the explanation of the question above. 2009 What is Interim Table – Simple Definition of Interim Table The interim table is a table that is generated by joining two tables and not the final result table. In other words, when two tables are joined they create an interim table as resultset but the resultset is not final yet. It may be possible that more tables are about to join on the interim table, and more operations are still to be applied on that table (e.g. Order By, Having etc). Besides, it may be possible that there is no interim table; sometimes final table is what is generated when the query is run. 2010 Stored Procedure and Transactions If Stored Procedure is transactional then, it should roll back complete transactions when it encounters any errors. Well, that does not happen in this case, which proves that Stored Procedure does not only provide just the transactional feature to a batch of T-SQL. Generate Database Script for SQL Azure When talking about SQL Azure the most common complaint I hear is that the script generated from stand-along SQL Server database is not compatible with SQL Azure. This was true for some time for sure but not any more. If you have SQL Server 2008 R2 installed you can follow the guideline below to generate a script which is compatible with SQL Azure. Convert IN to EXISTS – Performance Talk It is NOT necessary that every time when IN is replaced by EXISTS it gives better performance. However, in our case listed above it does for sure give better performance. You can read about this subject in the associated blog post. Subquery or Join – Various Options – SQL Server Engine Knows the Best Every single time whenever there is a performance tuning exercise, I hear the conversation from developer where some prefer subquery and some prefer join. In this two part blog post, I explain the same in the detail with examples. Part 1 | Part 2 Merge Operations – Insert, Update, Delete in Single Execution MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. 2011 Puzzle – Statistics are not updated but are Created Once Here is the quick scenario about my setup. Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated – WHY? Question to You – When to use Function and When to use Stored Procedure Personally, I believe that they are both different things - they cannot be compared. I can say, it will be like comparing apples and oranges. Each has its own unique use. However, they can be used interchangeably at many times and in real life (i.e., production environment). I have personally seen both of these being used interchangeably many times. This is the precise reason for asking this question. 2012 In year 2012 I had two interesting series ran on the blog. If there is no fun in learning, the learning becomes a burden. For the same reason, I had decided to build a three part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz. Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Guess the Next Value – Puzzle 4 Simple Example to Configure Resource Governor – Introduction to Resource Governor Resource Governor is a feature which can manage SQL Server Workload and System Resource Consumption. We can limit the amount of CPU and memory consumption by limiting /governing /throttling on the SQL Server. If there are different workloads running on SQL Server and each of the workload needs different resources or when workloads are competing for resources with each other and affecting the performance of the whole server resource governor is a very important task. Tricks to Replace SELECT * with Column Names – SQL in Sixty Seconds #017 – Video Retrieves unnecessary columns and increases network traffic When a new columns are added views needs to be refreshed manually Leads to usage of sub-optimal execution plan Uses clustered index in most of the cases instead of using optimal index It is difficult to debug SQL SERVER – Load Generator – Free Tool From CodePlex The best part of this SQL Server Load Generator is that users can run multiple simultaneous queries again SQL Server using different login account and different application name. The interface of the tool is extremely easy to use and very intuitive as well. A Puzzle – Swap Value of Column Without Case Statement Let us assume there is a single column in the table called Gender. The challenge is to write a single update statement which will flip or swap the value in the column. For example if the value in the gender column is ‘male’ swap it with ‘female’ and if the value is ‘female’ swap it with ‘male’. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Profiling Startup Of VS2012 – SpeedTrace Profiler

- by Alois Kraus

SpeedTrace is a relatively unknown profiler made a company called Ipcas. A single professional license does cost 449€+VAT. For the test I did use SpeedTrace 4.5 which is currently Beta. Although it is cheaper than dotTrace it has by far the most options to influence how profiling does work. First you need to create a tracing project which does configure tracing for one process type. You can start the application directly from the profiler or (much more interesting) it does attach to a specific process when it is started. For this you need to check “Trace the specified …” radio button and enter the process name in the “Process Name of the Trace” edit box. You can even selectively enable tracing for processes with a specific command line. Then you need to activate the trace project by pressing the Activate Project button and you are ready to start VS as usual. If you want to profile the next 10 VS instances that you start you can set the Number of Processes counter to e.g. 10. This is immensely helpful if you are trying to profile only the next 5 started processes. As you can see there are many more tabs which do allow to influence tracing in a much more sophisticated way. SpeedTrace is the only profiler which does not rely entirely on the profiling Api of .NET. Instead it does modify the IL code (instrumentation on the fly) to write tracing information to disc which can later be analyzed. This approach is not only very fast but it does give you unprecedented analysis capabilities. Once the traces are collected they do show up in your workspace where you can open the trace viewer. I do skip the other windows because this view is by far the most useful one. You can sort the methods not only by Wall Clock time but also by CPU consumption and wait time which none of the other products support in their views at the same time. If you want to optimize for CPU consumption sort by CPU time. If you want to find out where most time is spent you need Clock Total time and Clock Waiting. There you can directly see if the method did take long because it did wait on something or it did really execute stuff that did take so long. Once you have found a method you want to drill deeper you can double click on a method to get to the Caller/Callee view which is similar to the JetBrains Method Grid view. But this time you do see much more. In the middle is the clicked method. Above are the methods that call you and below are the methods that you do directly call. Normally you would then start digging deeper to find the end of the chain where the slow method worth optimizing is located. But there is a shortcut. You can press the magic button to calculate the aggregation of all called methods. This is displayed in the lower left window where you can see each method call and how long it did take. There you can also sort to see if this call stack does only contain methods (e.g. WCF connect calls which you cannot make faster) not worth optimizing. YourKit has a similar feature where it is called Callees List. In the Functions tab you have in the context menu also many other useful analysis options One really outstanding feature is the View Call History Drilldown. When you select this one you get not a sum of all method invocations but a list with the duration of each method call. This is not surprising since SpeedTrace does use tracing to get its timings. There you can get many useful graphs how this method did behave over time. Did it become slower at some point in time or was only the first call slow? The diagrams and the list will tell you that. That is all fine but what should I do when one method call was slow? I want to see from where it was coming from. No problem select the method in the list hit F10 and you get the call stack. This is a life saver if you e.g. search for serialization problems. Today Serializers are used everywhere. You want to find out from where the 5s XmlSerializer.Deserialize call did come from? Hit F10 and you get the call stack which did invoke the 5s Deserialize call. The CPU timeline tab is also useful to find out where long pauses or excessive CPU consumption did happen. Click in the graph to get the Thread Stacks window where you can get a quick overview what all threads were doing at this time. This does look like the Stack Traces feature in YourKit. Only this time you get the last called method first which helps to quickly see what all threads were executing at this moment. YourKit does generate a rather long list which can be hard to go through when you have many threads. The thread list in the middle does not give you call stacks or anything like that but you see which methods were found most often executing code by the profiler which is a good indication for methods consuming most CPU time. This does sound too good to be true? I have not told you the best part yet. The best thing about this profiler is the staff behind it. When I do see a crash or some other odd behavior I send a mail to Ipcas and I do get usually the next day a mail that the problem has been fixed and a download link to the new version. The guys at Ipcas are even so helpful to log in to your machine via a Citrix Client to help you to get started profiling your actual application you want to profile. After a 2h telco I was converted from a hater to a believer of this tool. The fast response time might also have something to do with the fact that they are actively working on 4.5 to get out of the door. But still the support is by far the best I have encountered so far. The only downside is that you should instrument your assemblies including the .NET Framework to get most accurate numbers. You can profile without doing it but then you will see very high JIT times in your process which can severely affect the correctness of the measured timings. If you do not care about exact numbers you can also enable in the main UI in the Data Trace tab logging of method arguments of primitive types. If you need to know what files at which times were opened by your application you can find it out without a debugger. Since SpeedTrace does read huge trace files in its reader you should perhaps use a 64 bit machine to be able to analyze bigger traces as well. The memory consumption of the trace reader is too high for my taste. But they did promise for the next version to come up with something much improved.

Read the article
What's new in Solaris 11.1?

- by Karoly Vegh

Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1. Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all. I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop ) V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples. VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog. VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg. parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness. per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature. NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones. VIII: Security got a lot of attention too: Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits. PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. government security accredited fashion. SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed. Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified. IX. Networking, as one of the core strengths of Solaris 11, has been extended with: Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly X. Data management: FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all. The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%! XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges. The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract. For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. Open up a Support Request, explain that this is an RFE, describe the feature you/your company desires to have in S11 implemented. The more SRs are collected for an RFE, the more chance it's got to get implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input.Feel free to comment about this post too. Except that it's too long ;) wbr,charlie

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
Building an OpenStack Cloud for Solaris Engineering, Part 1

- by Dave Miner

One of the signature features of the recently-released Solaris 11.2 is the OpenStack cloud computing platform. Over on the Solaris OpenStack blog the development team is publishing lots of details about our version of OpenStack Havana as well as some tips on specific features, and I highly recommend reading those to get a feel for how we've leveraged Solaris's features to build a top-notch cloud platform. In this and some subsequent posts I'm going to look at it from a different perspective, which is that of the enterprise administrator deploying an OpenStack cloud. But this won't be just a theoretical perspective: I've spent the past several months putting together a deployment of OpenStack for use by the Solaris engineering organization, and now that it's in production we'll share how we built it and what we've learned so far.In the Solaris engineering organization we've long had dedicated lab systems dispersed among our various sites and a home-grown reservation tool for developers to reserve those systems; various teams also have private systems for specific testing purposes. But as a developer, it can still be difficult to find systems you need, especially since most Solaris changes require testing on both SPARC and x86 systems before they can be integrated. We've added virtual resources over the years as well in the form of LDOMs and zones (both traditional non-global zones and the new kernel zones). Fundamentally, though, these were all still deployed in the same model: our overworked lab administrators set up pre-configured resources and we then reserve them. Sounds like pretty much every traditional IT shop, right? Which means that there's a lot of opportunity for efficiencies from greater use of virtualization and the self-service style of cloud computing. As we were well into development of OpenStack on Solaris, I was recruited to figure out how we could deploy it to both provide more (and more efficient) development and test resources for the organization as well as a test environment for Solaris OpenStack.At this point, let's acknowledge one fact: deploying OpenStack is hard. It's a very complex piece of software that makes use of sophisticated networking features and runs as a ton of service daemons with myriad configuration files. The web UI, Horizon, doesn't often do a good job of providing detailed errors. Even the command-line clients are not as transparent as you'd like, though at least you can turn on verbose and debug messaging and often get some clues as to what to look for, though it helps if you're good at reading JSON structure dumps. I'd already learned all of this in doing a single-system Grizzly-on-Linux deployment for the development team to reference when they were getting started so I at least came to this job with some appreciation for what I was taking on. The good news is that both we and the community have done a lot to make deployment much easier in the last year; probably the easiest approach is to download the OpenStack Unified Archive from OTN to get your hands on a single-system demonstration environment. I highly recommend getting started with something like it to get some understanding of OpenStack before you embark on a more complex deployment. For some situations, it may in fact be all you ever need. If so, you don't need to read the rest of this series of posts!In the Solaris engineering case, we need a lot more horsepower than a single-system cloud can provide. We need to support both SPARC and x86 VM's, and we have hundreds of developers so we want to be able to scale to support thousands of VM's, though we're going to build to that scale over time, not immediately. We also want to be able to test both Solaris 11 updates and a release such as Solaris 12 that's under development so that we can work out any upgrade issues before release. One thing we don't have is a requirement for extremely high availability, at least at this point. We surely don't want a lot of down time, but we can tolerate scheduled outages and brief (as in an hour or so) unscheduled ones. Thus I didn't need to spend effort on trying to get high availability everywhere.The diagram below shows our initial deployment design. We're using six systems, most of which are x86 because we had more of those immediately available. All of those systems reside on a management VLAN and are connected with a two-way link aggregation of 1 Gb links (we don't yet have 10 Gb switching infrastructure in place, but we'll get there). A separate VLAN provides "public" (as in connected to the rest of Oracle's internal network) addresses, while we use VxLANs for the tenant networks. One system is more or less the control node, providing the MySQL database, RabbitMQ, Keystone, and the Nova API and scheduler as well as the Horizon console. We're curious how this will perform and I anticipate eventually splitting at least the database off to another node to help simplify upgrades, but at our present scale this works.I had a couple of systems with lots of disk space, one of which was already configured as the Automated Installation server for the lab, so it's just providing the Glance image repository for OpenStack. The other node with lots of disks provides Cinder block storage service; we also have a ZFS Storage Appliance that will help back-end Cinder in the near future, I just haven't had time to get it configured in yet.There's a separate system for Neutron, which is our Elastic Virtual Switch controller and handles the routing and NAT for the guests. We don't have any need for firewalling in this deployment so we're not doing so. We presently have only two tenants defined, one for the Solaris organization that's funding this cloud, and a separate tenant for other Oracle organizations that would like to try out OpenStack on Solaris. Each tenant has one VxLAN defined initially, but we can of course add more. Right now we have just a single /24 network for the floating IP's, once we get demand up to where we need more then we'll add them.Finally, we have started with just two compute nodes; one is an x86 system, the other is an LDOM on a SPARC T5-2. We'll be adding more when demand reaches the level where we need them, but as we're still ramping up the user base it's less work to manage fewer nodes until then.My next post will delve into the details of building this OpenStack cloud's infrastructure, including how we're using various Solaris features such as Automated Installation, IPS packaging, SMF, and Puppet to deploy and manage the nodes. After that we'll get into the specifics of configuring and running OpenStack itself.

Read the article
ASP.NET MVC for the php/asp noob

- by dotjosh

I was talking to a friend today, who's foremost a php developer, about his thoughts on Umbraco and he said "Well they're apparently working feverishly on the new version of Umbraco, which will be MVC... which i still don't know what that means, but I know you like it." I ended up giving him a ground up explanation of ASP.NET MVC, so I'm posting this so he can link this to his friends and for anyone else who finds it useful. The whole goal was to be as simple as possible, not being focused on proper syntax. Model-View-Controller (or MVC) is just a pattern that is used for handling UI interaction with your backend. In a typical web app, you can imagine the *M*odel as your database model, the *V*iew as your HTML page, and the *C*ontroller as the class inbetween. MVC handles your web request different than your typical php/asp app.In your php/asp app, your url maps directly to a php/asp file that contains html, mixed with database access code and redirects.In an MVC app, your url route is mapped to a method on a class (the controller). The body of this method can do some database access and THEN decide which *V*iew (html/aspx page) should be displayed; putting the controller in charge and not the view... a clear seperation of concerns that provides better reusibility and generally promotes cleaner code. Mysite.com, a quick example:Let's say you hit the following url in your application: http://www.mysite.com/Product/ShowItem?Id=4 To avoid tedious configuration, MVC uses a lot of conventions by default. For instance, the above url in your app would automatically make MVC search for a .net class with the name "Product" and a method named "ShowItem" based on the pattern of the url. So if you name things properly, your method would automatically be called when you entered the above url. Additionally, it would automatically map/hydrate the "int id" parameter that was in your querystring, matched by name.Product.cspublic class Product : Controller{ public ViewResult ShowItem(int id) { return View(); }} From this point you can write the code in the body of this method to do some database access and then pass a "bag" (also known as the ViewData) of data to your chosen *V*iew (html page) to use for display. The view(html) ONLY needs to be worried about displaying the flattened data that it's been given in the best way it can; this allows the view to be reused throughout your application as *just* a view, and not be coupled to HOW the data for that view get's loaded.. Product.cspublic class Product : Controller{ public ViewResult ShowItem(int id) { var database = new Database(); var item = database.GetItem(id); ViewData["TheItem"] = item; return View(); }} Again by convention, since the class' method name is "ShowItem", it'll search for a view named "ShowItem.aspx" by default, and pass the ViewData bag to it to use. ShowItem.aspx<html> <body> <% var item =(Item)ViewData["TheItem"] %> <h1><%= item.FullProductName %></h1> </body></html> BUT WAIT! WHY DOES MICROSOFT HAVE TO DO THINGS SO DIFFERENTLY!?They aren't... here are some other frameworks you may have heard of that use the same pattern in a their own way: Ruby On Rails Grails Spring MVC Struts Django

Read the article
Python Coding standards vs. productivity

- by Shroatmeister

I work for a large humanitarian organisation, on a project building software that could help save lives in emergencies by speeding up the distribution of food. Many NGOs desperately need our software and we are weeks behind schedule. One thing that worries me in this project is what I think is an excessive focus on coding standards. We write in python/django and use a version of PEP0008, with various modifications e.g. line lengths can go up to 160 chars and all lines should go that long if possible, no blank lines between imports, line wrapping rules that apply only to certain kinds of classes, lots of templates that we must use, even if they aren't the best way to solve a problem etc. etc. One core dev spent a week rewriting a major part of the system to meet the then new coding standards, throwing away several suites of tests in the process, as the rewrite meant they were 'invalid'. We spent two weeks rewriting all the functionality that was lost, and fixing bugs. He is the lead dev and his word carries weight, so he has convinced the project manager that these standards are necessary. The junior devs do as they are told. I sense that the project manager has a strong feeling of cognitive dissonance about all this but nevertheless agrees with it vehemently as he feels unsure what else to do. Today I got in serious trouble because I had forgotten to put some spaces after commas in a keyword argument. I was literally shouted at by two other devs and the project manager during a Skype call. Personally I think coding standards are important but also think that we are wasting a lot of time obsessing with them, and when I verbalized this it provoked rage. I'm seen as a troublemaker in the team, a team that is looking for scapegoats for its failings. Since the introduction of the coding standards, the team's productivity has measurably plummeted, however this only reinforces the obsession, i.e. the lead dev simply blames our non-adherence to standards for the lack of progress. He believes that we can't read each other's code if we don't adhere to the conventions. This is starting to turn sticky. Now I am trying to modify various scripts, autopep8, pep8ify and PythonTidy to try to match the conventions. We also run pep8 against source code but there are so many implicit amendments to our standard that it's hard to track them all. The lead dev simple picks faults that the pep8 script doesn't pick up and shouts at us in the next stand-up meeting. Every week there are new additions to the coding standards that force us to rewrite existing, working, tested code. Thank heavens we still have tests, (I reverted some commits and fixed a bunch of the ones he removed). All the while there is increasing pressure to meet the deadline. I believe a fundamental issue is that the lead dev and another core dev refuse to trust other developers to do their job. But how to deal with that? We can't do our job because we are too busy rewriting everything. I've never encountered this dynamic in a software engineering team. Am I wrong to question their adherence to coding standards? Has anyone else experienced a similar situation and how have they dealt with it successfully? (I'm not looking for a discussion just actual solutions people have found)

Read the article
apt-get upgrade stuck at the same package

- by decibyte

Current status I've started to suspect this is not an Ubuntu issue, but related to the internet connection here at my work. Until I'm sure, Im leaving my question below: Original question I'm stuck, can't upgrade my system. Running sudo apt-get upgrade gives me the following: mmm@alalunga:~$ sudo apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done The following packages have been kept back: ginn libgrip0 linux-generic-pae linux-headers-generic-pae linux-image-generic-pae The following packages will be upgraded: apport apport-gtk bind9-host build-essential dhcp3-client dhcp3-common dnsutils eog evince evince-common firefox firefox-branding firefox-dbg firefox-globalmenu firefox-gnome-support firefox-locale-en gimp gimp-data gir1.2-totem-1.0 glib-networking glib-networking-common glib-networking-services gnupg gpgv icedtea-6-jre-cacao icedtea-6-jre-jamvm icedtea-6-plugin icedtea-netx icedtea-netx-common icedtea-plugin isc-dhcp-client isc-dhcp-common libapache2-mod-php5 libart-2.0-2 libbind9-80 libdns81 libevince3-3 libgimp2.0 libisc83 libisccc80 libisccfg82 liblwres80 libssl-dev libssl-doc libssl1.0.0 libtotem0 linux-firmware linux-libc-dev openjdk-6-jre openjdk-6-jre-headless openjdk-6-jre-lib openssl php-pear php5-cli php5-common php5-curl php5-dev php5-gd php5-mysql php5-xsl policykit-1-gnome python-apport python-django python-gst0.10 python-problem-report resolvconf thunderbird thunderbird-globalmenu thunderbird-gnome-support totem totem-common totem-mozilla totem-plugins xserver-xorg-input-synaptics 74 upgraded, 0 newly installed, 0 to remove and 5 not upgraded. Need to get 317 MB/327 MB of archives. After this operation, 1.481 kB of additional disk space will be used. Do you want to continue [Y/n]? Get:1 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:2 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:3 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:4 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:5 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:6 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] Get:7 http://archive.ubuntu.com/ubuntu/ precise-updates/main openjdk-6-jre-headless i386 6b24-1.11.4-1ubuntu0.12.04.1 [27,3 MB] 9% [7 openjdk-6-jre-headless 27,3 MB/27,3 MB 100%] It keeps downloading the package openjdk-6-jre-headless, then does nothing for a while (hanging on what's the last line above), then download the package again. It's at its 13th download attempt at the moment of writing. The actual downloads seem to be done just fine, but whatever it does after downloading seems to be failing. I tried removing openjdk-6, but then it wanted to install openjdk-7 instead, with the same result, hanging at openjdk-7-jre-headless instead. I also tried changing servers from my local (Danish) to the main server. No luck. It's also keeping me from upgrading alle the other packages. What to do? Update After following instructions in the answer by @lpanebr, it is now stuck at the linux-firmware package. So, maybe it's a more general problem than being related to specific package(s)? Although it did download some packages without problems before getting stuck at linux-firmware.

Read the article
Type of computer for a developer on the road

- by nabucosound

Hi developers: I am planning to be traveling through eurasia and asia (russia, china, korea, japan, south east asia...) for a while and, although there are plenty of marvelous things to see and to do, I must keep on working :(. I am a python developer, dedicated mainly to web projects. I use django, sqlite3, browsers, and ocassionaly (only if I have no choice) I install postgres, mysql, apache or any other servers commonly used in the internets. I do my coding on vim, use ssh to connect, lftp to transfer files, IRC, grep/ack... So I spend most of my time in the terminal shells. But I also use IM or Skype to communicate with my clients and peers, as well as some other software (that after all is not mandatory for my day-to-day work). I currently work with a Macbook Pro (3 years old now) and so far I am very happy with the performance. But I don't want to carry it if I am going to be "on transit" for long time, it is simply huge and heavy for what I am planning to load in my rather small backpack (while traveling, less is more, you know). So here I am reading all kind of opinions about netbooks, because at first sight this is the kind of computer I thought I had to choose. I am going to use Linux for it, Microsoft is not my cup of tea and Mac is not available for them, unless I were to buy a Macbook air, something that I won't do because if I am robbed or rain/dust/truck loaders break it I would burst in tears. I am concerned about wifi performance and connectivity, I am going to use one of those linux distros/tools to hack/test on "open" networks (if you know what I mean) in case I am not in a place with real free wifi access and I find myself in an emergency. CPU speed should be acceptable, but since I don't plan to run Photoshop or expensive IDEs, I guess most of the time I won't be overloading the machine. Apart from this, maybe (surely) I am missing other features to consider. With that said (sorry about the length) here it comes my question, raised from a deep ignorance regarding the wars betweeb betbooks vs notebooks (I assume tablet PCs are not for programming yet): If I buy a netbook will I have to throw it away after 1 month on the road and buy a notebook? Or will I be OK? Thanks! Hector Update I have received great feedback so far! I would like to insist on the fact that I will be traveling through many different countries and scenarios. I am sure that while in Japan I will be more than fine with anything related to technology, connectivity, etc. But consider that I will be, for example, on a train through Russia (transsiberian) and will cross Mongolia as well. I will stay in friends' places sometimes, but most of the time I will have to work from hostel rooms, trains, buses, beaches (hey this last one doesn't sound too bad hehe!). I think some of your answers guys seem to focus on the geek part but loose the point of this "on the road" fact. I am very aware and agree that netbooks suck compared to notebooks, but what I am trying to do here is to find a balance and discover your experiences with netbooks to see first hand if a netbook will be a fail in the mid-long term of the trip for my purposes. So I have resumed the main concepts expressed here on this small list, in no particular order: keyboard/touchpad feel: I use vim so no need of moving mouse pointers that much, unless I am browsing the web, but intensive use of keyboard screen real state: again, terminal work for most of the time battery life: I think something very important weight/size: also very important looks not worth stealing it, don't give a shit if it is lost/stolen/broken: this may depend on kind of person, your economy, etc. Also to prevent losing work, I will upload EVERYTHING to the cloud whenever I'll have a chance. wifi: don't want to discover my wifi is one of those that cannot deal with half the routers on this planet or has poor connectivity. Thanks again for your answers and comments!

Read the article
How do I add PHP support to Apache 2 without breaking my current installation?

- by Hobhouse

I run Apache 2 with WSGI (for a Django-app) on a Ubuntu box. I want to use Nagios for server monitoring, and for this purpose it seems I have to add PHP support to Apache. When I installed Apache 2, I did this: apt-get install apache2 apache2.2-common apache2-mpm-worker apache2-threaded-dev libapache2-mod-wsgi python-dev Available modules for apache2 are these: /etc/apache2/mods-available$ ls actions.conf authn_default.load cache.load deflate.conf filter.load mime.conf proxy_ftp.load suexec.load actions.load authn_file.load cern_meta.load deflate.load headers.load mime.load proxy_http.load unique_id.load alias.conf authnz_ldap.load cgi.load dir.conf ident.load mime_magic.conf rewrite.load userdir.conf alias.load authz_dbm.load cgid.conf dir.load imagemap.load mime_magic.load setenvif.conf userdir.load asis.load authz_default.load cgid.load disk_cache.conf include.load negotiation.conf setenvif.load usertrack.load auth_basic.load authz_groupfile.load charset_lite.load disk_cache.load info.conf negotiation.load speling.load version.load auth_digest.load authz_host.load dav.load dump_io.load info.load proxy.conf ssl.conf vhost_alias.load authn_alias.load authz_owner.load dav_fs.conf env.load ldap.load proxy.load ssl.load wsgi.conf authn_anon.load authz_user.load dav_fs.load expires.load log_forensic.load proxy_ajp.load status.conf wsgi.load authn_dbd.load autoindex.conf dav_lock.load ext_filter.load mem_cache.conf proxy_balancer.load status.load authn_dbm.load autoindex.load dbd.load file_cache.load mem_cache.load proxy_connect.load substitute.load What is the best way for me to add PHP support to Apache 2 without breaking my current installation and configuration?

Read the article
How to pass a parameter to jQuery UI dialog event handler?

- by bebraw

I am currently trying to hook up jQuery UI dialog so that I may use it to create new items to my page and to modify ones existing already on the page. I managed in the former. I'm currently struggling in the latter problem, however. I just cannot find a nice way to pass the item to modify to the dialog. Here's some code to illustrate the issue better. Note especially the part marked with XXX. The {{}} parts are derived from Django template syntax: $(".exercise").click(function() { $.post("{{ request.path }}", { action: "create_dialog", exercise_name: $(this).text() }, function(data) { $("#modify_exercise").html(data.content); }, "json" ); $("#modify_exercise").dialog('open'); }); $("#modify_exercise").dialog({ autoOpen: false, resizable: false, modal: true, buttons: { '{% trans 'Modify' %}': function() { var $inputs = $('#modify_exercise :input'); var post_values = {}; $inputs.each(function() { post_values[this.name] = $(this).val(); }); post_values.action = 'validate_form'; //XXX: how to get the exercise name here? post_values.exercise_name = 'foobar'; $.post('{{ request.path }}', post_values, function(data) { if( data.status == 'invalid' ) { $('#modify_exercise').html(data.content); } else { location.reload(); } }, "json" ); } } }); Here's some markup to show how the code relates to the structure: <div id="modify_exercise" class="dialog" title="{% trans 'Modify exercise' %}"> </div> <ul> {% for exercise in exercises %} <li> <a class="exercise" href="#" title="{{ exercise.description }}"> {{ exercise.name }} </a> </li> {% endfor %} </ul>

Read the article
jqgrid not updating data on reload

- by meepmeep

I have a jqgrid with data loading from an xml stream (handled by django 1.1.1): jQuery(document).ready(function(){ jQuery("#list").jqGrid({ url:'/downtime/list_xml/', datatype: 'xml', mtype: 'GET', postData:{site:1,date_start:document.getElementById('datepicker_start').value,date_end:document.getElementById('datepicker_end').value}, colNames:[...], colModel :[...], pager: '#pager', rowNum: 25, rowList:[10,25,50], viewrecords: true, height: 500, caption: 'Click on column headers to reorder' }); $("#grid_reload").click(function(){ $("#list").trigger("reloadGrid"); }); $("#tabs").tabs(); $("#datepicker_start").datepicker({dateFormat: 'yy-mm-dd'}); $("#datepicker_end").datepicker({dateFormat: 'yy-mm-dd'}); ... And the html elements: <th>Start Date:</th> <td><input id="datepicker_start" type="text" value="2009-12-01"></input></td> <th>End Date:</th> <td><input id="datepicker_end" type="text" value="2009-12-03"></input></td> <td><input id="grid_reload" type="submit" value="load" /></td> When I click the grid_reload button, the grid reloads, but when it has done so it shows exactly the same data as before, even though the xml is tested to return different data for different timestamps. I have checked using alert(document.getElementById('datepicker_start').value) that the values in the date inputs are passed correctly when the reload event is triggered. Any ideas why the data doesn't update? A caching or browser issue perhaps?

Read the article
Creating Signed URLs for Amazon CloudFront

- by Zack

Short version: How do I make signed URLs "on-demand" to mimic Nginx's X-Accel-Redirect behavior (i.e. protecting downloads) with Amazon CloudFront/S3 using Python. I've got a Django server up and running with an Nginx front-end. I've been getting hammered with requests to it and recently had to install it as a Tornado WSGI application to prevent it from crashing in FastCGI mode. Now I'm having an issue with my server getting bogged down (i.e. most of its bandwidth is being used up) due to too many requests for media being made to it, I've been looking into CDNs and I believe Amazon CloudFront/S3 would be the proper solution for me. I've been using Nginx's X-Accel-Redirect header to protect the files from unauthorized downloading, but I don't have that ability with CloudFront/S3--however they do offer signed URLs. I'm no Python expert by far and definitely don't know how to create a Signed URL properly, so I was hoping someone would have a link for how to make these URLs "on-demand" or would be willing to explain how to here, it would be greatly appreciated. Also, is this the proper solution, even? I'm not too familiar with CDNs, is there a CDN that would be better suited for this?

Read the article
git push heroku master gives error ssh: connect to host heroku.com port 22: Connection refused

- by user1476508

I'm trying to run the heroku-django tutorial (using ubuntu 12.04) and it seems for some reason i cant push into heroku. here is what happens: yeinhorn@ubuntu:~/hellodjango$ git init Reinitialized existing Git repository in /home/yeinhorn/hellodjango/.git/ yeinhorn@ubuntu:~/hellodjango$ git add . yeinhorn@ubuntu:~/hellodjango$ git commit -m "my first commit" On branch master nothing to commit (working directory clean) yeinhorn@ubuntu:~/hellodjango$ heroku create Creating high-dusk-6308... done, stack is cedar http://high-dusk-6308.herokuapp.com/ | [email protected]:high-dusk-6308.git ! New default stack: Cedar. To use Bamboo, run heroku create -s bamboo. yeinhorn@ubuntu:~/hellodjango$ git remote -v heroku [email protected]:blazing-dusk-8587.git (fetch) heroku [email protected]:blazing-dusk-8587.git (push) yeinhorn@ubuntu:~/hellodjango$ git push heroku master ssh: connect to host heroku.com port 22: Connection refused fatal: The remote end hung up unexpectedly yeinhorn@ubuntu:~/hellodjango$ git push -f heroku ssh: connect to host heroku.com port 22: Connection refused fatal: The remote end hung up unexpectedly also when i run $telnet heroku.com 22 i get Trying 50.19.85.132... Trying 50.19.85.154... Trying 50.19.85.156... telnet: Unable to connect to remote host: Connection refused any ideas?

Read the article
is struts2 Framework dead? if so, which direction to go from here?

- by Omnipresent

I've been working in struts2 framework for the past year and a half. As I look around SO it seems like struts2 questions get the least amount of hits. At first I thought it was because the SO community is not much into struts2, but this is the case in other forums as well. Only the struts2 mailing list is pretty active. I think struts2 is slowly becoming a dinosaur framework and I want to move on. I've dabbled a little with other stuff like RoR, Grails, Django, .NET MVC but I've never really committed and made a pact with myself to sit down and learn these. Maybe its the fact that I dont know the languages that these frameworks are built on. Spending a day or two with RoR makes me want to choke some of the JAR files and struts2 html tags and .NET MVC looks promising but not knowing c# ties me back. My question is, have you been on the same path? Did anyone used to do struts2 before and now changed to something new? What would you suggest I learn after Struts2? And lastly, as a developer how do you motivate yourself to learn a new framework, or a language? for me, just typing the code provided in the book is dead boring.

Read the article
Easy way to observe user activity - how improve my database structure.

- by Thomas

Welcome, I need some advise to improve perfomence my web application. In the begin I had this structure of database: USER -id (Primary Key) -name -password -email .... PROFILE -user Primary Key, Foreign Key (USER) -birthday -region -photoFile ... PAGES -id (Primary Key) -user Foreign Key(USER) -page -date COMMENTS -id (Primary Key) -user Foreign Key(USER) -page Foreign Key(PAGE) -comment -date FAVOURITES_PAGES -id (Primary Key) -user Foreign Key(USER) -favourite_page Foreign Key(PAGE) -date but now one of the most important page of website is observatory, when everyone can observe activity others users. So I need select all pages, comments and favourites pages some users and display it in one list, sorted by date. For better perfomance (I think) I changed my structure to this: table USER and PROFILE without changes ACTIVITY (additional table- have common fields: user,date) -id (Primary Key) -user Foreign Key(USER) -date -page Foreign Key(PAGE) -comment Foreign Key(COMMENTS) -favourite_page Foreign Key(FAVOURITES_PAGES) PAGES -id (Primary Key) -page COMMENTS -id (Primary Key) -page Foreign Key(PAGE) -comment FAVOURITES_PAGES -id (Primary Key) -favourite_page Foreign Key(PAGE) So now it is very easy get sorted records from all tables. But I have no only foreign key to PAGES, COMMENTS and FAVOURITES_PAGES in ACTIVITY table - there is about ten Foreign Key fields and in one record only one have value, others have None: ACTIVITY id user date page comment ... 1 2 2010-02-23 None 1 2 1 2010-02-21 1 None .... It is corect solution. When I display about 40 records in one page (pagination) I must wait about one secound, but database is almost emty (a few users and about 100 records in others tables). It is depends on amount records per page - I have checked it, but why it takes too long time, becouse of relationships? The website is built in Python/Django. Any advices/opinion?

Read the article
Selecting a Java framework for large application w/ only ONE user

- by Bijan

I am building a large application that will be hosted on an AWS server. I'm trying to select a web framework for assisting me with code organization, template design, and generally presentation aspects. Here are some points of consideration: Require security/login/user authentication. I may add the ability in the future to allow more than just an administrator to access the web app, but it is not a public facing website. AJAX support would be helpful. There are a couple widgets that I don't want to recreate. One is a tree object, where the user can expand/contract items in the list, can create new branches, add/edit objects. This would be better off in some dynamic view rather than all done in ugly html. Generally, this is just to provide the application with a face for control, management, and monitoring. Having an easier time adding buttons, CSS, AJAX widgets are great additions though, but not the primary purpose. I'm considering: Wicket Spring Seam GWT Stripe and the list goes on, as I'm sure you all know. I originally planned on using GWT, but then started to feel that GWT didn't cover my primary needs. I could be wrong about this, but there seems to be a lot of support for GWT AND Wicket/Spring. All of this 'getting lost in java frameworks' got me thinking outside the java realm for a framework that would suit my needs that was a clear option, like: JRuby/Rails Jython/Django Groovy/Grails Guice (just throwing this in there... I don't clearly understand the main purposes of all these frameworks. It doesn't seem like DInjection is something I need for a single purpose application) Thanks as always. This community makes Googling for esoteric programming information an order of magnitude better.

Read the article

Search Results

Search found 3821 results on 153 pages for 'django aggregation'.

Page 149/153 | < Previous Page | 145 146 147 148 149 150 151 152 153 | Next Page >

- by Reed

- by Reed

- by JuergenKress

- by charlie.berger

- by Katrina Gosek

- by mprove

- by Katrina Gosek

- by Paul White

- by Pinal Dave

- by Alois Kraus

- by Karoly Vegh

- by Paul White

- by Dave Miner

- by dotjosh

- by Shroatmeister

- by decibyte

- by nabucosound

- by Hobhouse

- by bebraw

- by meepmeep

- by Zack

- by user1476508

- by Omnipresent

- by Thomas

- by Bijan

< Previous Page | 145 146 147 148 149 150 151 152 153 | Next Page >