Search Results

Search found 10235 results on 410 pages for 'inner loop'.

Page 221/410 | < Previous Page | 217 218 219 220 221 222 223 224 225 226 227 228 | Next Page >

Apache SSL reverse proxy to a Embed Tomcat

- by ggarcia24

I'm trying to put in place a reverse proxy for an application that is running a tomcat embed server over SSL. The application needs to run over SSL on the port 9002 so I have no way of "disabling SSL" for this app. The current setup schema looks like this: [192.168.0.10:443 - Apache with mod_proxy] --> [192.168.0.10:9002 - Tomcat App] After googling on how to make such a setup (and testing) I came across this: https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/861137 Which lead to make my current configuration (to try to emulate the --secure-protocol=sslv3 option of wget) /etc/apache2/sites/enabled/default-ssl: <VirtualHost _default_:443> SSLEngine On SSLCertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key SSLProxyEngine On SSLProxyProtocol SSLv3 SSLProxyCipherSuite SSLv3 ProxyPass /test/ https://192.168.0.10:9002/ ProxyPassReverse /test/ https://192.168.0.10:9002/ LogLevel debug ErrorLog /var/log/apache2/error-ssl.log CustomLog /var/log/apache2/access-ssl.log combined </VirtualHost> The thing is that the error log is showing error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol Complete request log: [Wed Mar 13 20:05:57 2013] [debug] mod_proxy.c(1020): Running scheme https handler (attempt 0) [Wed Mar 13 20:05:57 2013] [debug] mod_proxy_http.c(1973): proxy: HTTP: serving URL https://192.168.0.10:9002/ [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2011): proxy: HTTPS: has acquired connection for (192.168.0.10) [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2067): proxy: connecting https://192.168.0.10:9002/ to 192.168.0.10:9002 [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2193): proxy: connected / to 192.168.0.10:9002 [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2444): proxy: HTTPS: fam 2 socket created to connect to 192.168.0.10 [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2576): proxy: HTTPS: connection complete to 192.168.0.10:9002 (192.168.0.10) [Wed Mar 13 20:05:57 2013] [info] [client 192.168.0.10] Connection to child 0 established (server demo1agrubu01.demo.lab:443) [Wed Mar 13 20:05:57 2013] [info] Seeding PRNG with 656 bytes of entropy [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_kernel.c(1866): OpenSSL: Handshake: start [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: before/connect initialization [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: unknown state [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_io.c(1897): OpenSSL: read 7/7 bytes from BIO#7f122800a100 [mem: 7f1230018f60] (BIO dump follows) [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_io.c(1830): +-------------------------------------------------------------------------+ [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_io.c(1869): | 0000: 15 03 01 00 02 02 50 ......P | [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_io.c(1875): +-------------------------------------------------------------------------+ [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_kernel.c(1903): OpenSSL: Exit: error in unknown state [Wed Mar 13 20:05:57 2013] [info] [client 192.168.0.10] SSL Proxy connect failed [Wed Mar 13 20:05:57 2013] [info] SSL Library Error: 336032002 error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol [Wed Mar 13 20:05:57 2013] [info] [client 192.168.0.10] Connection closed to child 0 with abortive shutdown (server example1.domain.tld:443) [Wed Mar 13 20:05:57 2013] [error] (502)Unknown error 502: proxy: pass request body failed to 172.31.4.13:9002 (192.168.0.10) [Wed Mar 13 20:05:57 2013] [error] [client 192.168.0.10] proxy: Error during SSL Handshake with remote server returned by /dsfe/ [Wed Mar 13 20:05:57 2013] [error] proxy: pass request body failed to 192.168.0.10:9002 (172.31.4.13) from 172.31.4.13 () [Wed Mar 13 20:05:57 2013] [debug] proxy_util.c(2029): proxy: HTTPS: has released connection for (172.31.4.13) [Wed Mar 13 20:05:57 2013] [debug] ssl_engine_kernel.c(1884): OpenSSL: Write: SSL negotiation finished successfully [Wed Mar 13 20:05:57 2013] [info] [client 192.168.0.10] Connection closed to child 6 with standard shutdown (server example1.domain.tld:443) If I do a wget --secure-protocol=sslv3 --no-check-certificate https://192.168.0.10:9002/ it works perfectly, but from apache is not working. I'm on an Ubuntu Server with the latest updates running apache2 with mod_proxy and mod_ssl enabled: ~$ cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS" ~# dpkg -s apache2 ... Version: 2.2.22-1ubuntu1.2 ... ~# dpkg -s openssl ... Version: 1.0.1-4ubuntu5.7 ... Hope that anyone may help

Read the article
Parallelism in .NET – Part 5, Partitioning of Work

- by Reed

When parallelizing any routine, we start by decomposing the problem. Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element. This process is called partitioning. Partitioning our tasks is a challenging feat. There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle. Trying to work the perfect balance between the two extremes is the goal for which we should aim. Luckily, the Task Parallel Library automatically handles much of this process. However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions. First off, I’d like to say that this is a more advanced topic. It is perfectly acceptable to use the parallel constructs in the framework without considering the partitioning taking place. The default behavior in the Task Parallel Library is very well-behaved, even for unusual work loads, and should rarely be adjusted. I have found few situations where the default partitioning behavior in the TPL is not as good or better than my own hand-written partitioning routines, and recommend using the defaults unless there is a strong, measured, and profiled reason to avoid using them. However, understanding partitioning, and how the TPL partitions your data, helps in understanding the proper usage of the TPL. I indirectly mentioned partitioning while discussing aggregation. Typically, our systems will have a limited number of Processing Elements (PE), which is the terminology used for hardware capable of processing a stream of instructions. For example, in a standard Intel i7 system, there are four processor cores, each of which has two potential hardware threads due to Hyperthreading. This gives us a total of 8 PEs – theoretically, we can have up to eight operations occurring concurrently within our system. In order to fully exploit this power, we need to partition our work into Tasks. A task is a simple set of instructions that can be run on a PE. Ideally, we want to have at least one task per PE in the system, since fewer tasks means that some of our processing power will be sitting idle. A naive implementation would be to just take our data, and partition it with one element in our collection being treated as one task. When we loop through our collection in parallel, using this approach, we’d just process one item at a time, then reuse that thread to process the next, etc. There’s a flaw in this approach, however. It will tend to be slower than necessary, often slower than processing the data serially. The problem is that there is overhead associated with each task. When we take a simple foreach loop body and implement it using the TPL, we add overhead. First, we change the body from a simple statement to a delegate, which must be invoked. In order to invoke the delegate on a separate thread, the delegate gets added to the ThreadPool’s current work queue, and the ThreadPool must pull this off the queue, assign it to a free thread, then execute it. If our collection had one million elements, the overhead of trying to spawn one million tasks would destroy our performance. The answer, here, is to partition our collection into groups, and have each group of elements treated as a single task. By adding a partitioning step, we can break our total work into small enough tasks to keep our processors busy, but large enough tasks to avoid overburdening the ThreadPool. There are two clear, opposing goals here: Always try to keep each processor working, but also try to keep the individual partitions as large as possible. When using Parallel.For, the partitioning is always handled automatically. At first, partitioning here seems simple. A naive implementation would merely split the total element count up by the number of PEs in the system, and assign a chunk of data to each processor. Many hand-written partitioning schemes work in this exactly manner. This perfectly balanced, static partitioning scheme works very well if the amount of work is constant for each element. However, this is rarely the case. Often, the length of time required to process an element grows as we progress through the collection, especially if we’re doing numerical computations. In this case, the first PEs will finish early, and sit idle waiting on the last chunks to finish. Sometimes, work can decrease as we progress, since previous computations may be used to speed up later computations. In this situation, the first chunks will be working far longer than the last chunks. In order to balance the workload, many implementations create many small chunks, and reuse threads. This adds overhead, but does provide better load balancing, which in turn improves performance. The Task Parallel Library handles this more elaborately. Chunks are determined at runtime, and start small. They grow slowly over time, getting larger and larger. This tends to lead to a near optimum load balancing, even in odd cases such as increasing or decreasing workloads. Parallel.ForEach is a bit more complicated, however. When working with a generic IEnumerable<T>, the number of items required for processing is not known in advance, and must be discovered at runtime. In addition, since we don’t have direct access to each element, the scheduler must enumerate the collection to process it. Since IEnumerable<T> is not thread safe, it must lock on elements as it enumerates, create temporary collections for each chunk to process, and schedule this out. By default, it uses a partitioning method similar to the one described above. We can see this directly by looking at the Visual Partitioning sample shipped by the Task Parallel Library team, and available as part of the Samples for Parallel Programming. When we run the sample, with four cores and the default, Load Balancing partitioning scheme, we see this: The colored bands represent each processing core. You can see that, when we started (at the top), we begin with very small bands of color. As the routine progresses through the Parallel.ForEach, the chunks get larger and larger (seen by larger and larger stripes). Most of the time, this is fantastic behavior, and most likely will out perform any custom written partitioning. However, if your routine is not scaling well, it may be due to a failure in the default partitioning to handle your specific case. With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner. There is the option to use an overload of Parallel.ForEach which takes a Partitioner<T> instance. The Partitioner<T> class is an abstract class which allows for both static and dynamic partitioning. By overriding Partitioner<T>.SupportsDynamicPartitions, you can specify whether a dynamic approach is available. If not, your custom Partitioner<T> subclass would override GetPartitions(int), which returns a list of IEnumerator<T> instances. These are then used by the Parallel class to split work up amongst processors. When dynamic partitioning is available, GetDynamicPartitions() is used, which returns an IEnumerable<T> for each partition. If you do decide to implement your own Partitioner<T>, keep in mind the goals and tradeoffs of different partitioning strategies, and design appropriately. The Samples for Parallel Programming project includes a ChunkPartitioner class in the ParallelExtensionsExtras project. This provides example code for implementing your own, custom allocation strategies, including a static allocator of a given chunk size. Although implementing your own Partitioner<T> is possible, as I mentioned above, this is rarely required or useful in practice. The default behavior of the TPL is very good, often better than any hand written partitioning strategy.

Read the article
Parallelism in .NET – Part 6, Declarative Data Parallelism

- by Reed

When working with a problem that can be decomposed by data, we have a collection, and some operation being performed upon the collection. I’ve demonstrated how this can be parallelized using the Task Parallel Library and imperative programming using imperative data parallelism via the Parallel class. While this provides a huge step forward in terms of power and capabilities, in many cases, special care must still be given for relative common scenarios. C# 3.0 and Visual Basic 9.0 introduced a new, declarative programming model to .NET via the LINQ Project. When working with collections, we can now write software that describes what we want to occur without having to explicitly state how the program should accomplish the task. By taking advantage of LINQ, many operations become much shorter, more elegant, and easier to understand and maintain. Version 4.0 of the .NET framework extends this concept into the parallel computation space by introducing Parallel LINQ. Before we delve into PLINQ, let’s begin with a short discussion of LINQ. LINQ, the extensions to the .NET Framework which implement language integrated query, set, and transform operations, is implemented in many flavors. For our purposes, we are interested in LINQ to Objects. When dealing with parallelizing a routine, we typically are dealing with in-memory data storage. More data-access oriented LINQ variants, such as LINQ to SQL and LINQ to Entities in the Entity Framework fall outside of our concern, since the parallelism there is the concern of the data base engine processing the query itself. LINQ (LINQ to Objects in particular) works by implementing a series of extension methods, most of which work on IEnumerable<T>. The language enhancements use these extension methods to create a very concise, readable alternative to using traditional foreach statement. For example, let’s revisit our minimum aggregation routine we wrote in Part 4: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re doing a very simple computation, but writing this in an imperative style. This can be loosely translated to English as: Create a very large number, and save it in min Loop through each item in the collection. For every item: Perform some computation, and save the result If the computation is less than min, set min to the computation Although this is fairly easy to follow, it’s quite a few lines of code, and it requires us to read through the code, step by step, line by line, in order to understand the intention of the developer. We can rework this same statement, using LINQ: double min = collection.Min(item => item.PerformComputation()); Here, we’re after the same information. However, this is written using a declarative programming style. When we see this code, we’d naturally translate this to English as: Save the Min value of collection, determined via calling item.PerformComputation() That’s it – instead of multiple logical steps, we have one single, declarative request. This makes the developer’s intentions very clear, and very easy to follow. The system is free to implement this using whatever method required. Parallel LINQ (PLINQ) extends LINQ to Objects to support parallel operations. This is a perfect fit in many cases when you have a problem that can be decomposed by data. To show this, let’s again refer to our minimum aggregation routine from Part 4, but this time, let’s review our final, parallelized version: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Here, we’re doing the same computation as above, but fully parallelized. Describing this in English becomes quite a feat: Create a very large number, and save it in min Create a temporary object we can use for locking Call Parallel.ForEach, specifying three delegates For the first delegate: Initialize a local variable to hold the local state to a very large number For the second delegate: For each item in the collection, perform some computation, save the result If the result is less than our local state, save the result in local state For the final delegate: Take a lock on our temporary object to protect our min variable Save the min of our min and local state variables Although this solves our problem, and does it in a very efficient way, we’ve created a set of code that is quite a bit more difficult to understand and maintain. PLINQ provides us with a very nice alternative. In order to use PLINQ, we need to learn one new extension method that works on IEnumerable<T> – ParallelEnumerable.AsParallel(). That’s all we need to learn in order to use PLINQ: one single method. We can write our minimum aggregation in PLINQ very simply: double min = collection.AsParallel().Min(item => item.PerformComputation()); By simply adding “.AsParallel()” to our LINQ to Objects query, we converted this to using PLINQ and running this computation in parallel! This can be loosely translated into English easily, as well: Process the collection in parallel Get the Minimum value, determined by calling PerformComputation on each item Here, our intention is very clear and easy to understand. We just want to perform the same operation we did in serial, but run it “as parallel”. PLINQ completely extends LINQ to Objects: the entire functionality of LINQ to Objects is available. By simply adding a call to AsParallel(), we can specify that a collection should be processed in parallel. This is simple, safe, and incredibly useful.

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
SQL SERVER – Difference Between DATETIME and DATETIME2

- by pinaldave

Yesterday I have written a very quick blog post on SQL SERVER – Difference Between GETDATE and SYSDATETIME and I got tremendous response for the same. I suggest you read that blog post before continuing this blog post today. I had asked people to honestly take part and share their view about above two system function. There are few emails as well few comments on the blog post asking question how did I come to know the difference between the same. The answer is real world issues. I was called in for performance tuning consultancy where I was asked very strange question by one developer. Here is the situation he was facing. System had a single table with two different column of datetime. One column was datelastmodified and second column was datefirstmodified. One of the column was DATETIME and another was DATETIME2. Developer was populating them with SYSDATETIME respectively. He was always thinking that the value inserted in the table will be the same. This table was only accessed by INSERT statement and there was no updates done over it in application.One fine day he ran distinct on both of this column and was in for surprise. He always thought that both of the table will have same data, but in fact they had very different data. He presented this scenario to me. I said this can not be possible but when looked at the resultset, I had to agree with him. Here is the simple script generated to demonstrate the problem he was facing. This is just a sample of original table. DECLARE @Intveral INT SET @Intveral = 10000 CREATE TABLE #TimeTable (FirstDate DATETIME, LastDate DATETIME2) WHILE (@Intveral > 0) BEGIN INSERT #TimeTable (FirstDate, LastDate) VALUES (SYSDATETIME(), SYSDATETIME()) SET @Intveral = @Intveral - 1 END GO SELECT COUNT(DISTINCT FirstDate) D_GETDATE, COUNT(DISTINCT LastDate) D_SYSGETDATE FROM #TimeTable GO SELECT DISTINCT a.FirstDate, b.LastDate FROM #TimeTable a INNER JOIN #TimeTable b ON a.FirstDate = b.LastDate GO SELECT * FROM #TimeTable GO DROP TABLE #TimeTable GO Let us see the resultset. You can clearly see from result that SYSDATETIME() does not populate the same value in the both of the field. In fact the value is either rounded down or rounded up in the field which is DATETIME. Event though we are populating the same value, the values are totally different in both the column resulting the SELF JOIN fail and display different DISTINCT values. The best policy is if you are using DATETIME use GETDATE() and if you are suing DATETIME2 use SYSDATETIME() to populate them with current date and time to accurately address the precision. As DATETIME2 is introduced in SQL Server 2008, above script will only work with SQL SErver 2008 and later versions. I hope I have answered few questions asked yesterday. Reference: Pinal Dave (http://www.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL DateTime, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SSIS Lookup component tuning tips

- by jamiet

Yesterday evening I attended a London meeting of the UK SQL Server User Group at Microsoft’s offices in London Victoria. As usual it was both a fun and informative evening and in particular there seemed to be a few questions arising about tuning the SSIS Lookup component; I rattled off some comments and figured it would be prudent to drop some of them into a dedicated blog post, hence the one you are reading right now. Scene setting A popular pattern in SSIS is to use a Lookup component to determine whether a record in the pipeline already exists in the intended destination table or not and I cover this pattern in my 2006 blog post Checking if a row exists and if it does, has it changed? (note to self: must rewrite that blog post for SSIS2008). Fundamentally the SSIS lookup component (when using FullCache option) sucks some data out of a database and holds it in memory so that it can be compared to data in the pipeline. One of the big benefits of using SSIS dataflows is that they process data one buffer at a time; that means that not all of the data from your source exists in the dataflow at the same time and is why a SSIS dataflow can process data volumes that far exceed the available memory. However, that only applies to data in the pipeline; for reasons that are hopefully obvious ALL of the data in the lookup set must exist in the memory cache for the duration of the dataflow’s execution which means that any memory used by the lookup cache will not be available to be used as a pipeline buffer. Moreover, there’s an obvious correlation between the amount of data in the lookup cache and the time it takes to charge that cache; the more data you have then the longer it will take to charge and the longer you have to wait until the dataflow actually starts to do anything. For these reasons your goal is simple: ensure that the lookup cache contains as little data as possible. General tips Here is a simple tick list you can follow in order to tune your lookups: Use a SQL statement to charge your cache, don’t just pick a table from the dropdown list made available to you. (Read why in SELECT *... or select from a dropdown in an OLE DB Source component?) Only pick the columns that you need, ignore everything else Make the database columns that your cache is populated from as narrow as possible. If a column is defined as VARCHAR(20) then SSIS will allocate 20 bytes for every value in that column – that is a big waste if the actual values are significantly less than 20 characters in length. Do you need DT_WSTR typed columns or will DT_STR suffice? DT_WSTR uses twice the amount of space to hold values that can be stored using a DT_STR so if you can use DT_STR, consider doing so. Same principle goes for the numerical datatypes DT_I2/DT_I4/DT_I8. Only populate the cache with data that you KNOW you will need. In other words, think about your WHERE clause! Thinking outside the box It is tempting to build a large monolithic dataflow that does many things, one of which is a Lookup. Often though you can make better use of your available resources by, well, mixing things up a little and here are a few ideas to get your creative juices flowing: There is no rule that says everything has to happen in a single dataflow. If you have some particularly resource intensive lookups then consider putting that lookup into a dataflow all of its own and using raw files to pass the pipeline data in and out of that dataflow. Know your data. If you think, for example, that the majority of your incoming rows will match with only a small subset of your lookup data then consider chaining multiple lookup components together; the first would use a FullCache containing that data subset and the remaining data that doesn’t find a match could be passed to a second lookup that perhaps uses a NoCache lookup thus negating the need to pull all of that least-used lookup data into memory. Do you need to process all of your incoming data all at once? If you can process different partitions of your data separately then you can partition your lookup cache as well. For example, if you are using a lookup to convert a location into a [LocationId] then why not process your data one region at a time? This will mean your lookup cache only has to contain data for the location that you are currently processing and with the ability of the Lookup in SSIS2008 and beyond to charge the cache using a dynamically built SQL statement you’ll be able to achieve it using the same dataflow and simply loop over it using a ForEach loop. Taking the previous data partitioning idea further … a dataflow can contain more than one data path so why not split your data using a conditional split component and, again, charge your lookup caches with only the data that they need for that partition. Lookups have two uses: to (1) find a matching row from the lookup set and (2) put attributes from that matching row into the pipeline. Ask yourself, do you need to do these two things at the same time? After all once you have the key column(s) from your lookup set then you can use that key to get the rest of attributes further downstream, perhaps even in another dataflow. Are you using the same lookup data set multiple times? If so, consider the file caching option in SSIS 2008 and beyond. Above all, experiment and be creative with different combinations. You may be surprised at what works. Final thoughts If you want to know more about how the Lookup component differs in SSIS2008 from SSIS2005 then I have a dedicated blog post about that at Lookup component gets a makeover. I am on a mini-crusade at the moment to get a BULK MERGE feature into the database engine, the thinking being that if the database engine can quickly merge massive amounts of data in a similar manner to how it can insert massive amounts using BULK INSERT then that’s a lot of work that wouldn’t have to be done in the SSIS pipeline. If you think that is a good idea then go and vote for BULK MERGE on Connect. If you have any other tips to share then please stick them in the comments. Hope this helps! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How can I get penetration depth from Minkowski Portal Refinement / Xenocollide?

- by Raven Dreamer

I recently got an implementation of Minkowski Portal Refinement (MPR) successfully detecting collision. Even better, my implementation returns a good estimate (local minimum) direction for the minimum penetration depth. So I took a stab at adjusting the algorithm to return the penetration depth in an arbitrary direction, and was modestly successful - my altered method works splendidly for face-edge collision resolution! What it doesn't currently do, is correctly provide the minimum penetration depth for edge-edge scenarios, such as the case on the right: What I perceive to be happening, is that my current method returns the minimum penetration depth to the nearest vertex - which works fine when the collision is actually occurring on the plane of that vertex, but not when the collision happens along an edge. Is there a way I can alter my method to return the penetration depth to the point of collision, rather than the nearest vertex? Here's the method that's supposed to return the minimum penetration distance along a specific direction: public static Vector3 CalcMinDistance(List<Vector3> shape1, List<Vector3> shape2, Vector3 dir) { //holding variables Vector3 n = Vector3.zero; Vector3 swap = Vector3.zero; // v0 = center of Minkowski sum v0 = Vector3.zero; // Avoid case where centers overlap -- any direction is fine in this case //if (v0 == Vector3.zero) return Vector3.zero; //always pass in a valid direction. // v1 = support in direction of origin n = -dir; //get the differnce of the minkowski sum Vector3 v11 = GetSupport(shape1, -n); Vector3 v12 = GetSupport(shape2, n); v1 = v12 - v11; //if the support point is not in the direction of the origin if (v1.Dot(n) <= 0) { //Debug.Log("Could find no points this direction"); return Vector3.zero; } // v2 - support perpendicular to v1,v0 n = v1.Cross(v0); if (n == Vector3.zero) { //v1 and v0 are parallel, which means //the direction leads directly to an endpoint n = v1 - v0; //shortest distance is just n //Debug.Log("2 point return"); return n; } //get the new support point Vector3 v21 = GetSupport(shape1, -n); Vector3 v22 = GetSupport(shape2, n); v2 = v22 - v21; if (v2.Dot(n) <= 0) { //can't reach the origin in this direction, ergo, no collision //Debug.Log("Could not reach edge?"); return Vector2.zero; } // Determine whether origin is on + or - side of plane (v1,v0,v2) //tests linesegments v0v1 and v0v2 n = (v1 - v0).Cross(v2 - v0); float dist = n.Dot(v0); // If the origin is on the - side of the plane, reverse the direction of the plane if (dist > 0) { //swap the winding order of v1 and v2 swap = v1; v1 = v2; v2 = swap; //swap the winding order of v11 and v12 swap = v12; v12 = v11; v11 = swap; //swap the winding order of v11 and v12 swap = v22; v22 = v21; v21 = swap; //and swap the plane normal n = -n; } /// // Phase One: Identify a portal while (true) { // Obtain the support point in a direction perpendicular to the existing plane // Note: This point is guaranteed to lie off the plane Vector3 v31 = GetSupport(shape1, -n); Vector3 v32 = GetSupport(shape2, n); v3 = v32 - v31; if (v3.Dot(n) <= 0) { //can't enclose the origin within our tetrahedron //Debug.Log("Could not reach edge after portal?"); return Vector3.zero; } // If origin is outside (v1,v0,v3), then eliminate v2 and loop if (v1.Cross(v3).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v2 = v3; v21 = v31; v22 = v32; n = (v1 - v0).Cross(v3 - v0); continue; } // If origin is outside (v3,v0,v2), then eliminate v1 and loop if (v3.Cross(v2).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v1 = v3; v11 = v31; v12 = v32; n = (v3 - v0).Cross(v2 - v0); continue; } bool hit = false; /// // Phase Two: Refine the portal int phase2 = 0; // We are now inside of a wedge... while (phase2 < 20) { phase2++; // Compute normal of the wedge face n = (v2 - v1).Cross(v3 - v1); n.Normalize(); // Compute distance from origin to wedge face float d = n.Dot(v1); // If the origin is inside the wedge, we have a hit if (d > 0 ) { //Debug.Log("Do plane test here"); float T = n.Dot(v2) / n.Dot(dir); Vector3 pointInPlane = (dir * T); return pointInPlane; } // Find the support point in the direction of the wedge face Vector3 v41 = GetSupport(shape1, -n); Vector3 v42 = GetSupport(shape2, n); v4 = v42 - v41; float delta = (v4 - v3).Dot(n); float separation = -(v4.Dot(n)); if (delta <= kCollideEpsilon || separation >= 0) { //Debug.Log("Non-convergance detected"); //Debug.Log("Do plane test here"); return Vector3.zero; } // Compute the tetrahedron dividing face (v4,v0,v1) float d1 = v4.Cross(v1).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v2) float d2 = v4.Cross(v2).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v3) float d3 = v4.Cross(v3).Dot(v0); if (d1 < 0) { if (d2 < 0) { // Inside d1 & inside d2 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } else { // Inside d1 & outside d2 ==> eliminate v3 v3 = v4; v31 = v41; v32 = v42; } } else { if (d3 < 0) { // Outside d1 & inside d3 ==> eliminate v2 v2 = v4; v21 = v41; v22 = v42; } else { // Outside d1 & outside d3 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } } } return Vector3.zero; } }

Read the article
ODI 12c - Parallel Table Load

- by David Allan

In this post we will look at the ODI 12c capability of parallel table load from the aspect of the mapping developer and the knowledge module developer - two quite different viewpoints. This is about parallel table loading which isn't to be confused with loading multiple targets per se. It supports the ability for ODI mappings to be executed concurrently especially if there is an overlap of the datastores that they access, so any temporary resources created may be uniquely constructed by ODI. Temporary objects can be anything basically - common examples are staging tables, indexes, views, directories - anything in the ETL to help the data integration flow do its job. In ODI 11g users found a few workarounds (such as changing the technology prefixes - see here) to build unique temporary names but it was more of a challenge in error cases. ODI 12c mappings by default operate exactly as they did in ODI 11g with respect to these temporary names (this is also true for upgraded interfaces and scenarios) but can be configured to support the uniqueness capabilities. We will look at this feature from two aspects; that of a mapping developer and that of a developer (of procedures or KMs). 1. Firstly as a Mapping Developer..... 1.1 Control when uniqueness is enabled A new property is available to set unique name generation on/off. When unique names have been enabled for a mapping, all temporary names used by the collection and integration objects will be generated using unique names. This property is presented as a check-box in the Property Inspector for a deployment specification. 1.2 Handle cleanup after successful execution Provided that all temporary objects that are created have a corresponding drop statement then all of the temporary objects should be removed during a successful execution. This should be the case with the KMs developed by Oracle. 1.3 Handle cleanup after unsuccessful execution If an execution failed in ODI 11g then temporary tables would have been left around and cleaned up in the subsequent run. In ODI 12c, KM tasks can now have a cleanup-type task which is executed even after a failure in the main tasks. These cleanup tasks will be executed even on failure if the property 'Remove Temporary Objects on Error' is set. If the agent was to crash and not be able to execute this task, then there is an ODI tool (OdiRemoveTemporaryObjects here) you can invoke to cleanup the tables - it supports date ranges and the like. That's all there is to it from the aspect of the mapping developer it's much, much simpler and straightforward. You can now execute the same mapping concurrently or execute many mappings using the same resource concurrently without worrying about conflict. 2. Secondly as a Procedure or KM Developer..... In the ODI Operator the executed code shows the actual name that is generated - you can also see the runtime code prior to execution (introduced in 11.1.1.7), for example below in the code type I selected 'Pre-executed Code' this lets you see the code about to be processed and you can also see the executed code (which is the default view). References to the collection (C$) and integration (I$) names will be automatically made unique by using the odiRef APIs - these objects will have unique names whenever concurrency has been enabled for a particular mapping deployment specification. It's also possible to use name uniqueness functions in procedures and your own KMs. 2.1 New uniqueness tags You can also make your own temporary objects have unique names by explicitly including either %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG in the name passed to calls to the odiRef APIs. Such names would always include the unique tag regardless of the concurrency setting. To illustrate, let's look at the getObjectName() method. At <% expansion time, this API will append %UNIQUE_STEP_TAG to the object name for collection and integration tables. The name parameter passed to this API may contain %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. This API always generates to the <? version of getObjectName() At execution time this API will replace the unique tag macros with a string that is unique to the current execution scope. The returned name will conform to the name-length restriction for the target technology, and its pattern for the unique tag. Any necessary truncation will be performed against the initial name for the object and any other fixed text that may have been specified. Examples are:- <?=odiRef.getObjectName("L", "%COL_PRFEMP%UNIQUE_STEP_TAG", "D")?> SCOTT.C$_EABH7QI1BR1EQI3M76PG9SIMBQQ <?=odiRef.getObjectName("L", "EMP%UNIQUE_STEP_TAG_AE", "D")?> SCOTT.EMPAO96Q2JEKO0FTHQP77TMSAIOSR_ Methods which have this kind of support include getFrom, getTableName, getTable, getObjectShortName and getTemporaryIndex. There are APIs for retrieving this tag info also, the getInfo API has been extended with the following properties (the UNIQUE* properties can also be used in ODI procedures); UNIQUE_STEP_TAG - Returns the unique value for the current step scope, e.g. 5rvmd8hOIy7OU2o1FhsF61 Note that this will be a different value for each loop-iteration when the step is in a loop. UNIQUE_SESSION_TAG - Returns the unique value for the current session scope, e.g. 6N38vXLrgjwUwT5MseHHY9 IS_CONCURRENT - Returns info about the current mapping, will return 0 or 1 (only in % phase) GUID_SRC_SET - Returns the UUID for the current source set/execution unit (only in % phase) The getPop API has been extended with the IS_CONCURRENT property which returns info about an mapping, will return 0 or 1. 2.2 Additional APIs Some new APIs are provided including getFormattedName which will allow KM developers to construct a name from fixed-text or ODI symbols that can be optionally truncate to a max length and use a specific encoding for the unique tag. It has syntax getFormattedName(String pName[, String pTechnologyCode]) This API is available at both the % and the ? phase. The format string can contain the ODI prefixes that are available for getObjectName(), e.g. %INT_PRF, %COL_PRF, %ERR_PRF, %IDX_PRF alongwith %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. The latter tags will be expanded into a unique string according to the specified technology. Calls to this API within the same execution context are guaranteed to return the same unique name provided that the same parameters are passed to the call. e.g. <%=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")%> <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")?> C$_MY_TAB7wDiBe80vBog1auacS1xB_AE <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG.log", "FILE")?> C2_MY_TAB7wDiBe80vBog1auacS1xB.log 2.3 Name length generation As part of name generation, the length of the generated name will be compared with the maximum length for the target technology and truncation may need to be applied. When a unique tag is included in the generated string it is important that uniqueness is not compromised by truncation of the unique tag. When a unique tag is NOT part of the generated name, the name will be truncated by removing characters from the end - this is the existing 11g algorithm. When a unique tag is included, the algorithm will first truncate the <postfix> and if necessary the <prefix>. It is recommended that users will ensure there is sufficient uniqueness in the <prefix> section to ensure uniqueness of the final resultant name. SUMMARY To summarize, ODI 12c make it much simpler to utilize mappings in concurrent cases and provides APIs for helping developing any procedures or custom knowledge modules in such a way they can be used in highly concurrent, parallel scenarios.

Read the article
Using a "white list" for extracting terms for Text Mining

- by [email protected]

In Part 1 of my post on "Generating cluster names from a document clustering model" (part 1, part 2, part 3), I showed how to build a clustering model from text documents using Oracle Data Miner, which automates preparing data for text mining. In this process we specified a custom stoplist and lexer and relied on Oracle Text to identify important terms. However, there is an alternative approach, the white list, which uses a thesaurus object with the Oracle Text CTXRULE index to allow you to specify the important terms. INTRODUCTIONA stoplist is used to exclude, i.e., black list, specific words in your documents from being indexed. For example, words like a, if, and, or, and but normally add no value when text mining. Other words can also be excluded if they do not help to differentiate documents, e.g., the word Oracle is ubiquitous in the Oracle product literature. One problem with stoplists is determining which words to specify. This usually requires inspecting the terms that are extracted, manually identifying which ones you don't want, and then re-indexing the documents to determine if you missed any. Since a corpus of documents could contain thousands of words, this could be a tedious exercise. Moreover, since every word is considered as an individual token, a term excluded in one context may be needed to help identify a term in another context. For example, in our Oracle product literature example, the words "Oracle Data Mining" taken individually are not particular helpful. The term "Oracle" may be found in nearly all documents, as with the term "Data." The term "Mining" is more unique, but could also refer to the Mining industry. If we exclude "Oracle" and "Data" by specifying them in the stoplist, we lose valuable information. But it we include them, they may introduce too much noise. Still, when you have a broad vocabulary or don't have a list of specific terms of interest, you rely on the text engine to identify important terms, often by computing the term frequency - inverse document frequency metric. (This is effectively a weight associated with each term indicating its relative importance in a document within a collection of documents. We'll revisit this later.) The results using this technique is often quite valuable. As noted above, an alternative to the subtractive nature of the stoplist is to specify a white list, or a list of terms--perhaps multi-word--that we want to extract and use for data mining. The obvious downside to this approach is the need to specify the set of terms of interest. However, this may not be as daunting a task as it seems. For example, in a given domain (Oracle product literature), there is often a recognized glossary, or a list of keywords and phrases (Oracle product names, industry names, product categories, etc.). Being able to identify multi-word terms, e.g., "Oracle Data Mining" or "Customer Relationship Management" as a single token can greatly increase the quality of the data mining results. The remainder of this post and subsequent posts will focus on how to produce a dataset that contains white list terms, suitable for mining. CREATING A WHITE LIST We'll leverage the thesaurus capability of Oracle Text. Using a thesaurus, we create a set of rules that are in effect our mapping from single and multi-word terms to the tokens used to represent those terms. For example, "Oracle Data Mining" becomes "ORACLEDATAMINING." First, we'll create and populate a mapping table called my_term_token_map. All text has been converted to upper case and values in the TERM column are intended to be mapped to the token in the TOKEN column. TERM TOKEN DATA MINING DATAMINING ORACLE DATA MINING ORACLEDATAMINING 11G ORACLE11G JAVA JAVA CRM CRM CUSTOMER RELATIONSHIP MANAGEMENT CRM ... Next, we'll create a thesaurus object my_thesaurus and a rules table my_thesaurus_rules: CTX_THES.CREATE_THESAURUS('my_thesaurus', FALSE); CREATE TABLE my_thesaurus_rules (main_term VARCHAR2(100), query_string VARCHAR2(400)); We next populate the thesaurus object and rules table using the term token map. A cursor is defined over my_term_token_map. As we iterate over the rows, we insert a synonym relationship 'SYN' into the thesaurus. We also insert into the table my_thesaurus_rules the main term, and the corresponding query string, which specifies synonyms for the token in the thesaurus. DECLARE cursor c2 is select token, term from my_term_token_map; BEGIN for r_c2 in c2 loop CTX_THES.CREATE_RELATION('my_thesaurus',r_c2.token,'SYN',r_c2.term); EXECUTE IMMEDIATE 'insert into my_thesaurus_rules values (:1,''SYN(' || r_c2.token || ', my_thesaurus)'')' using r_c2.token; end loop; END; We are effectively inserting the token to return and the corresponding query that will look up synonyms in our thesaurus into the my_thesaurus_rules table, for example: 'ORACLEDATAMINING' SYN ('ORACLEDATAMINING', my_thesaurus)At this point, we create a CTXRULE index on the my_thesaurus_rules table: create index my_thesaurus_rules_idx on my_thesaurus_rules(query_string) indextype is ctxsys.ctxrule; In my next post, this index will be used to extract the tokens that match each of the rules specified. We'll then compute the tf-idf weights for each of the terms and create a nested table suitable for mining.

Read the article
Passthrough Objects – Duck Typing++

- by EltonStoneman

[Source: http://geekswithblogs.net/EltonStoneman] Can't see a genuine use for this, but I got the idea in my head and wanted to work it through. It's an extension to the idea of duck typing, for scenarios where types have similar behaviour, but implemented in differently-named members. So you may have a set of objects you want to treat as an interface, which don't implement the interface explicitly, and don't have the same member names so they can't be duck-typed into implicitly implementing the interface. In a fictitious example, I want to call Get on whichever ICache implementation is current, and have the call passed through to the relevant method – whether it's called Read, Retrieve or whatever: A sample implementation is up on github here: PassthroughSample. This uses Castle's DynamicProxy behind the scenes in the same way as my duck typing sample, but allows you to configure the passthrough to specify how the inner (implementation) and outer (interface) members are mapped: var setup = new Passthrough(); var cache = setup.Create("PassthroughSample.Tests.Stubs.AspNetCache, PassthroughSample.Tests") .WithPassthrough("Name", "CacheName") .WithPassthrough("Get", "Retrieve") .WithPassthrough("Set", "Insert") .As<ICache>(); - or using some ugly Lambdas to avoid the strings : Expression<Func<ICache, string, object>> get = (o, s) => o.Get(s); Expression<Func<Memcached, string, object>> read = (i, s) => i.Read(s); Expression<Action<ICache, string, object>> set = (o, s, obj) => o.Set(s, obj); Expression<Action<Memcached, string, object>> insert = (i, s, obj) => i.Put(s, obj); ICache cache = new Passthrough<ICache, Memcached>() .Create() .WithPassthrough(o => o.Name, i => i.InstanceName) .WithPassthrough(get, read) .WithPassthrough(set, insert) .As(); - or even in config: ICache cache = Passthrough.GetConfigured<ICache>(); ... <passthrough> <types> <typename="PassthroughSample.Tests.Stubs.ICache, PassthroughSample.Tests" passesThroughTo="PassthroughSample.Tests.Stubs.AppFabricCache, PassthroughSample.Tests"> <members> <membername="Name"passesThroughTo="RegionName"/> <membername="Get"passesThroughTo="Out"/> <membername="Set"passesThroughTo="In"/> </members> </type> Possibly useful for injecting stubs for dependencies in tests, when your application code isn't using an IoC container. Possibly it also has an alternative implementation using .NET 4.0 dynamic objects, rather than the dynamic proxy.

Read the article
SQL SERVER – guest User and MSDB Database – Enable guest User on MSDB Database

- by pinaldave

I have written a few articles recently on the subject of guest account. Here’s a quick list of these articles: SQL SERVER – Disable Guest Account – Serious Security Issue SQL SERVER – Force Removing User from Database – Fix: Error: Could not drop login ‘test’ as the user is currently logged in. SQL SERVER – Detecting guest User Permissions – guest User Access Status One of the advices which I gave in all the three blog posts was: Disable the guest user in the user-created database. Additionally, I have mentioned that one should let the user account become enabled in MSDB database. I got many questions asking if there is any specific reason why this should be kept enabled, questions like, “What is the reason that MSDB database needs guest user?” Honestly, I did not know that the concept of the guest user will create so much interest in the readers. So now let’s turn this blog post into questions and answers format. Q: What will happen if the guest user is disabled in MSDB database? A: Lots of bad things will happen. Error 916 - Logins can connect to this instance of SQL Server but they do not have specific permissions in a database to receive the permissions of the guest user. Q: How can I determine if the guest user is enabled or disabled for any specific database? A: There are many ways to do this. Make sure that you run each of these methods with the context of the database. For an example for msdb database, you can run the following code: USE msdb; SELECT name, permission_name, state_desc FROM sys.database_principals dp INNER JOIN sys.server_permissions sp ON dp.principal_id = sp.grantee_principal_id WHERE name = 'guest' AND permission_name = 'CONNECT' There are many other methods to detect the guest user status. Read them here: Detecting guest User Permissions – guest User Access Status Q: What is the default status of the guest user account in database? A: Enabled in master, TempDb, and MSDB. Disabled in model database. Q: Why is the default status of the guest user disabled in model database? A: It is not recommended to enable the guest in user database as it can introduce serious security threat. It can seriously damage the database if configured incorrectly. Read more here: Disable Guest Account – Serious Security Issue Q: How to disable guest user? A: REVOKE CONNECT FROM guest Q: How to enable guest user? A: GRANT CONNECT TO guest Did I miss any critical question in the list? Please leave your question as a comment and I will add it to this list. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Security, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
How to get over “Did I lock the door?” syndrome

- by Boonei

I am person who always asks myself ”Did I lock the house door?”, And I do ask that question when I have almost reached office. I don’t have a bad memory or I am not a “forget it all after a min person”. Infact I have a fantastic memory of things. This problem has been haunting me for a very long time. My wife used to always have a angry face after we had get down from the car. Because after we have walked for about 20 yards I would run back to the car to check if I had locked the car, you see this problem exists for all locked objects. This happens everyday all round the year. Now a days I don’t have the problem ! I did not get the solution from any doctor or any book that that talks about my inner mind. It was a practical advice given by my aunt….. When I told her that I had this problem, she smiled and said its very very easy to get around this. I was stunned. The solution she gave me was simple. After I had locked the door, should hold the lock and look at it for 5 sec and say to myself “I have locked the door”. Believe me it works like a charm. The reason why it works is my aunt goes to explain, that your mind always thinks twice of important things that we do on our daily life and raises doubts after sometime. The only way to stop is it by looking at it, holding it and telling yourself that its ok and its done. This holds good for all the things that you generally doubt like, did I turn off the AC?, did I turn off the lights in the house when I left?. Just look at it for 5 sec, hold it tell yourself its done. You will not look back. Image credit [Håkan Dahlström] This article titled,How to get over “Did I lock the door?” syndrome, was originally published at Tech Dreams. Grab our rss feed or fan us on Facebook to get updates from us.

Read the article
Backup Meta-Data

- by BuckWoody

I'm working on a PowerShell script to show me the trending durations of my backup activities. The first thing I need is the data, so I looked at the Standard Reports in SQL Server Management Studio, and found a report that suited my needs, so I pulled out the script that it runs and modified it to this T-SQL Script. A few words here - you need to be in the MSDB database for this to run, and you can add a WHERE clause to limit to a database, timeframe, type of backup, whatever. For that matter, I won't use all of the data in this query in my PowerShell script, but it gives me lots of avenues to graph: SELECT distinct t1.name AS 'DatabaseName' ,(datediff( ss, t3.backup_start_date, t3.backup_finish_date)) AS 'DurationInSeconds' ,t3.user_name AS 'UserResponsible' ,t3.name AS backup_name ,t3.description ,t3.backup_start_date ,t3.backup_finish_date ,CASE WHEN t3.type = 'D' THEN 'Database' WHEN t3.type = 'L' THEN 'Log' WHEN t3.type = 'F' THEN 'FileOrFilegroup' WHEN t3.type = 'G' THEN 'DifferentialFile' WHEN t3.type = 'P' THEN 'Partial' WHEN t3.type = 'Q' THEN 'DifferentialPartial' END AS 'BackupType' ,t3.backup_size AS 'BackupSizeKB' ,t6.physical_device_name ,CASE WHEN t6.device_type = 2 THEN 'Disk' WHEN t6.device_type = 102 THEN 'Disk' WHEN t6.device_type = 5 THEN 'Tape' WHEN t6.device_type = 105 THEN 'Tape' END AS 'DeviceType' ,t3.recovery_model FROM sys.databases t1 INNER JOIN backupset t3 ON (t3.database_name = t1.name ) LEFT OUTER JOIN backupmediaset t5 ON ( t3.media_set_id = t5.media_set_id ) LEFT OUTER JOIN backupmediafamily t6 ON ( t6.media_set_id = t5.media_set_id ) ORDER BY backup_start_date DESC I'll munge this into my Excel PowerShell chart script tomorrow. Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. Yes, there are always multiple ways to do things, and this script may not work in every situation, for everything. It’s just a script, people. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Backup Meta-Data

- by BuckWoody

I'm working on a PowerShell script to show me the trending durations of my backup activities. The first thing I need is the data, so I looked at the Standard Reports in SQL Server Management Studio, and found a report that suited my needs, so I pulled out the script that it runs and modified it to this T-SQL Script. A few words here - you need to be in the MSDB database for this to run, and you can add a WHERE clause to limit to a database, timeframe, type of backup, whatever. For that matter, I won't use all of the data in this query in my PowerShell script, but it gives me lots of avenues to graph: SELECT distinct t1.name AS 'DatabaseName' ,(datediff( ss, t3.backup_start_date, t3.backup_finish_date)) AS 'DurationInSeconds' ,t3.user_name AS 'UserResponsible' ,t3.name AS backup_name ,t3.description ,t3.backup_start_date ,t3.backup_finish_date ,CASE WHEN t3.type = 'D' THEN 'Database' WHEN t3.type = 'L' THEN 'Log' WHEN t3.type = 'F' THEN 'FileOrFilegroup' WHEN t3.type = 'G' THEN 'DifferentialFile' WHEN t3.type = 'P' THEN 'Partial' WHEN t3.type = 'Q' THEN 'DifferentialPartial' END AS 'BackupType' ,t3.backup_size AS 'BackupSizeKB' ,t6.physical_device_name ,CASE WHEN t6.device_type = 2 THEN 'Disk' WHEN t6.device_type = 102 THEN 'Disk' WHEN t6.device_type = 5 THEN 'Tape' WHEN t6.device_type = 105 THEN 'Tape' END AS 'DeviceType' ,t3.recovery_model FROM sys.databases t1 INNER JOIN backupset t3 ON (t3.database_name = t1.name ) LEFT OUTER JOIN backupmediaset t5 ON ( t3.media_set_id = t5.media_set_id ) LEFT OUTER JOIN backupmediafamily t6 ON ( t6.media_set_id = t5.media_set_id ) ORDER BY backup_start_date DESC I'll munge this into my Excel PowerShell chart script tomorrow. Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. Yes, there are always multiple ways to do things, and this script may not work in every situation, for everything. It’s just a script, people. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Is your test method self-validating ?

- by mehfuzh

Writing state of art unit tests that can validate your every part of the framework is challenging and interesting at the same time, its like becoming a samurai. One of the key concept in this is to keep our test synced all the time as underlying code changes and thus breaking them to the furthest unit as possible. This also means, we should avoid multiple conditions embedded in a single test. Let’s consider the following example of transfer funds. [Fact] public void ShouldAssertTranserFunds() { var currencyService = Mock.Create<ICurrencyService>(); //// current rate Mock.Arrange(() => currencyService.GetConversionRate("AUS", "CAD")).Returns(0.88f); Account to = new Account { Currency = "AUS", Balance = 120 }; Account from = new Account { Currency = "CAD" }; AccountService accService = new AccountService(currencyService); Assert.Throws<InvalidOperationException>(() => accService.TranferFunds(to, from, 200f)); accService.TranferFunds(to, from, 100f); Assert.Equal(from.Balance, 88); Assert.Equal(20, to.Balance); } At first look, it seems ok but as you look more closely , it is actually doing two tasks in one test. At line# 10 it is trying to validate the exception for invalid fund transfer and finally it is asserting if the currency conversion is successfully made. Here, the name of the test itself is pretty vague. The first rule for writing unit test should always reflect to inner working of the target code, where just by looking at their names it is self explanatory. Having a obscure name for a test method not only increase the chances of cluttering the test code, but it also gives the opportunity to add multiple paths into it and eventually makes things messy as possible. I would rater have two test methods that explicitly describes its intent and are more self-validating. ShouldThrowExceptionForInvalidTransferOperation ShouldAssertTransferForExpectedConversionRate Having, this type of breakdown also helps us pin-point reported bugs easily rather wasting any time on debugging for something more general and can minimize confusion among team members. Finally, we should always make our test F.I.R.S.T ( Fast.Independent.Repeatable.Self-validating.Timely) [ Bob martin – Clean Code]. Only this will be enough to ensure, our test is as simple and clean as possible. Hope that helps

Read the article
June 2013 Release of the Ajax Control Toolkit

- by Stephen.Walther

I’m happy to announce the June 2013 release of the Ajax Control Toolkit. For this release, we enhanced the AjaxFileUpload control to support uploading files directly to Windows Azure. We also improved the SlideShow control by adding support for CSS3 animations. You can get the latest release of the Ajax Control Toolkit by visiting the project page at CodePlex (http://AjaxControlToolkit.CodePlex.com). Alternatively, you can execute the following NuGet command from the Visual Studio Library Package Manager window: Uploading Files to Azure The AjaxFileUpload control enables you to efficiently upload large files and display progress while uploading. With this release, we’ve added support for uploading large files directly to Windows Azure Blob Storage (You can continue to upload to your server hard drive if you prefer). Imagine, for example, that you have created an Azure Blob Storage container named pictures. In that case, you can use the following AjaxFileUpload control to upload to the container: <toolkit:ToolkitScriptManager runat="server" /> <toolkit:AjaxFileUpload ID="AjaxFileUpload1" StoreToAzure="true" AzureContainerName="pictures" runat="server" /> Notice that the AjaxFileUpload control is declared with two properties related to Azure. The StoreToAzure property causes the AjaxFileUpload control to upload a file to Azure instead of the local computer. The AzureContainerName property points to the blob container where the file is uploaded. .int3{position:absolute;clip:rect(487px,auto,auto,444px);}SMALL cash advance VERY CHEAP To use the AjaxFileUpload control, you need to modify your web.config file so it contains some additional settings. You need to configure the AjaxFileUpload handler and you need to point your Windows Azure connection string to your Blob Storage account. <configuration> <appSettings>  <add key="AjaxFileUploadAzureConnectionString" value="DefaultEndpointsProtocol=https;AccountName=testact;AccountKey=RvqL89Iw4npvPlAAtpOIPzrinHkhkb6rtRZmD0+ojZupUWuuAVJRyyF/LIVzzkoN38I4LSr8qvvl68sZtA152A=="/> </appSettings> <system.web> <compilation debug="true" targetFramework="4.5" /> <httpRuntime targetFramework="4.5" /> <httpHandlers> <add verb="*" path="AjaxFileUploadHandler.axd" type="AjaxControlToolkit.AjaxFileUploadHandler, AjaxControlToolkit"/> </httpHandlers> </system.web> <system.webServer> <validation validateIntegratedModeConfiguration="false" /> <handlers> <add name="AjaxFileUploadHandler" verb="*" path="AjaxFileUploadHandler.axd" type="AjaxControlToolkit.AjaxFileUploadHandler, AjaxControlToolkit"/> </handlers> <security> <requestFiltering> <requestLimits maxAllowedContentLength="4294967295"/> </requestFiltering> </security> </system.webServer> </configuration> You supply the connection string for your Azure Blob Storage account with the AjaxFileUploadAzureConnectionString property. If you set the value “UseDevelopmentStorage=true” then the AjaxFileUpload will upload to the simulated Blob Storage on your local machine. After you create the necessary configuration settings, you can use the AjaxFileUpload control to upload files directly to Azure (even very large files). Here’s a screen capture of how the AjaxFileUpload control appears in Google Chrome: After the files are uploaded, you can view the uploaded files in the Windows Azure Portal. You can see that all 5 files were uploaded successfully: New AjaxFileUpload Events In response to user feedback, we added two new events to the AjaxFileUpload control (on both the server and the client): · UploadStart – Raised on the server before any files have been uploaded. · UploadCompleteAll – Raised on the server when all files have been uploaded. · OnClientUploadStart – The name of a function on the client which is called before any files have been uploaded. · OnClientUploadCompleteAll – The name of a function on the client which is called after all files have been uploaded. These new events are most useful when uploading multiple files at a time. The updated AjaxFileUpload sample page demonstrates how to use these events to show the total amount of time required to upload multiple files (see the AjaxFileUpload.aspx file in the Ajax Control Toolkit sample site). SlideShow Animated Slide Transitions With this release of the Ajax Control Toolkit, we also added support for CSS3 animations to the SlideShow control. The animation is used when transitioning from one slide to another. Here’s the complete list of animations: · FadeInFadeOut · ScaleX · ScaleY · ZoomInOut · Rotate · SlideLeft · SlideDown You specify the animation which you want to use by setting the SlideShowAnimationType property. For example, here is how you would use the Rotate animation when displaying a set of slides: <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="ShowSlideShow.aspx.cs" Inherits="TestACTJune2013.ShowSlideShow" %> <%@ Register TagPrefix="toolkit" Namespace="AjaxControlToolkit" Assembly="AjaxControlToolkit" %> <script runat="Server" type="text/C#"> [System.Web.Services.WebMethod] [System.Web.Script.Services.ScriptMethod] public static AjaxControlToolkit.Slide[] GetSlides() { return new AjaxControlToolkit.Slide[] { new AjaxControlToolkit.Slide("slides/Blue hills.jpg", "Blue Hills", "Go Blue"), new AjaxControlToolkit.Slide("slides/Sunset.jpg", "Sunset", "Setting sun"), new AjaxControlToolkit.Slide("slides/Winter.jpg", "Winter", "Wintery..."), new AjaxControlToolkit.Slide("slides/Water lilies.jpg", "Water lillies", "Lillies in the water"), new AjaxControlToolkit.Slide("slides/VerticalPicture.jpg", "Sedona", "Portrait style picture") }; } </script> <!DOCTYPE html> <html > <head runat="server"> <title></title> </head> <body> <form id="form1" runat="server"> <div> <toolkit:ToolkitScriptManager ID="ToolkitScriptManager1" runat="server" /> <asp:Image ID="Image1" Height="300" Runat="server" /> <toolkit:SlideShowExtender ID="SlideShowExtender1" TargetControlID="Image1" SlideShowServiceMethod="GetSlides" AutoPlay="true" Loop="true" SlideShowAnimationType="Rotate" runat="server" /> </div> </form> </body> </html> In the code above, the set of slides is exposed by a page method named GetSlides(). The SlideShowAnimationType property is set to the value Rotate. The following animated GIF gives you an idea of the resulting slideshow: If you want to use either the SlideDown or SlideRight animations, then you must supply both an explicit width and height for the Image control which is the target of the SlideShow extender. For example, here is how you would declare an Image and SlideShow control to use a SlideRight animation: <toolkit:ToolkitScriptManager ID="ToolkitScriptManager1" runat="server" /> <asp:Image ID="Image1" Height="300" Width="300" Runat="server" /> <toolkit:SlideShowExtender ID="SlideShowExtender1" TargetControlID="Image1" SlideShowServiceMethod="GetSlides" AutoPlay="true" Loop="true" SlideShowAnimationType="SlideRight" runat="server" /> Notice that the Image control includes both a Height and Width property. Here’s an approximation of this animation using an animated GIF: Summary The Superexpert team worked hard on this release. We hope you like the new improvements to both the AjaxFileUpload and the SlideShow controls. We’d love to hear your feedback in the comments. On to the next sprint!

Read the article
SQL SERVER – Detecting guest User Permissions – guest User Access Status

- by pinaldave

Earlier I wrote the blog post SQL SERVER – Disable Guest Account – Serious Security Issue, and I got many comments asking questions related to the guest user. Here are the comments of Manoj: 1) How do we know if the uest user is enabled or disabled? 2) What is the default for guest user in SQL Server? Default settings for guest user When SQL Server is installed by default, the guest user is disabled for security reasons. If the guest user is not properly configured, it can create a major security issue. You can read more about this here. Identify guest user status There are multiple ways to identify guest user status: Using SQL Server Management Studio (SSMS) You can expand the database node >> Security >> Users. If you see the RED arrow pointing downward, it means that the guest user is disabled. Using sys.sysusers Here is a simple script. If you notice column dbaccess as 1, it means that the guest user is enabled and has access to the database. SELECT name, hasdbaccess FROM sys.sysusers WHERE name = 'guest' Using sys.database_principals and sys.server_permissions This script is valid in SQL Server 2005 and a later version. This is my default method recently. SELECT name, permission_name, state_desc FROM sys.database_principals dp INNER JOIN sys.server_permissions sp ON dp.principal_id = sp.grantee_principal_id WHERE name = 'guest' AND permission_name = 'CONNECT' Using sp_helprotect Just run the following stored procedure which will give you all the permissions associated with the user. sp_helprotect @username = 'guest' Disable Guest Account REVOKE CONNECT FROM guest Additionally, the guest account cannot be disabled in master and tempdb; it is always enabled. There is a special need for this. Let me ask a question back at you: In which scenario do you think this will be useful to keep the guest, and what will the additional configuration go along with the scenario? Note: Special mention to Imran Mohammed for being always there when users need help. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Security, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Working with Joins in LINQ

- by vik20000in

While working with data most of the time we have to work with relation between different lists of data. Many a times we want to fetch data from both the list at once. This requires us to make different kind of joins between the lists of data. LINQ support different kinds of join Inner Join List<Customer> customers = GetCustomerList(); List<Supplier> suppliers = GetSupplierList(); var custSupJoin = from sup in suppliers join cust in customers on sup.Country equals cust.Country select new { Country = sup.Country, SupplierName = sup.SupplierName, CustomerName = cust.CompanyName }; Group Join – where By the joined dataset is also grouped. List<Customer> customers = GetCustomerList(); List<Supplier> suppliers = GetSupplierList(); var custSupQuery = from sup in suppliers join cust in customers on sup.Country equals cust.Country into cs select new { Key = sup.Country, Items = cs }; We can also work with the Left outer join in LINQ like this. List<Customer> customers = GetCustomerList(); List<Supplier> suppliers = GetSupplierList(); var supplierCusts = from sup in suppliers join cust in customers on sup.Country equals cust.Country into cs from c in cs.DefaultIfEmpty() // DefaultIfEmpty preserves left-hand elements that have no matches on the right side orderby sup.SupplierName select new { Country = sup.Country, CompanyName = c == null ? "(No customers)" : c.CompanyName, SupplierName = sup.SupplierName};Vikram

Read the article
Combination of Operating Mode and Commit Strategy

- by Kevin Yang

If you want to populate a source into multiple targets, you may also want to ensure that every row from the source affects all targets uniformly (or separately). Let’s consider the Example Mapping below. If a row from SOURCE causes different changes in multiple targets (TARGET_1, TARGET_2 and TARGET_3), for example, it can be successfully inserted into TARGET_1 and TARGET_3, but failed to be inserted into TARGET_2, and the current Mapping Property TLO (target load order) is “TARGET_1 -> TARGET_2 -> TARGET_3”. What should Oracle Warehouse Builder do, in order to commit the appropriate data to all affected targets at the same time? If it doesn’t behave as you intended, the data could become inaccurate and possibly unusable. Example Mapping In OWB, we can use Mapping Configuration Commit Strategies and Operating Modes together to achieve this kind of requirements. Below we will explore the combination of these two features and how they affect the results in the target tables Before going to the example, let’s review some of the terms we will be using (Details can be found in white paper Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide11g Release 2): Operating Modes: Set-Based Mode: Warehouse Builder generates a single SQL statement that processes all data and performs all operations. Row-Based Mode: Warehouse Builder generates statements that process data row by row. The select statement is in a SQL cursor. All subsequent statements are PL/SQL. Row-Based (Target Only) Mode: Warehouse Builder generates a cursor select statement and attempts to include as many operations as possible in the cursor. For each target, Warehouse Builder inserts each row into the target separately. Commit Strategies: Automatic: Warehouse Builder loads and then automatically commits data based on the mapping design. If the mapping has multiple targets, Warehouse Builder commits and rolls back each target separately and independently of other targets. Use the automatic commit when the consequences of multiple targets being loaded unequally are not great or are irrelevant. Automatic correlated: It is a specialized type of automatic commit that applies to PL/SQL mappings with multiple targets only. Warehouse Builder considers all targets collectively and commits or rolls back data uniformly across all targets. Use the correlated commit when it is important to ensure that every row in the source affects all affected targets uniformly. Manual: select manual commit control for PL/SQL mappings when you want to interject complex business logic, perform validations, or run other mappings before committing data. Combination of the commit strategy and operating mode To understand the effects of each combination of operating mode and commit strategy, I’ll illustrate using the following example Mapping. Firstly we insert 100 rows into the SOURCE table and make sure that the 99th row and 100th row have the same ID value. And then we create a unique key constraint on ID column for TARGET_2 table. So while running the example mapping, OWB tries to load all 100 rows to each of the targets. But the mapping should fail to load the 100th row to TARGET_2, because it will violate the unique key constraint of table TARGET_2. With different combinations of Commit Strategy and Operating Mode, here are the results ¦ Set-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: A single error anywhere in the mapping triggers the rollback of all data. OWB encounters the error inserting into Target_2, it reports an error for the table and does not load the row. OWB rolls back all the rows inserted into Target_1 and does not attempt to load rows to Target_3. No rows are added to any of the target tables. ¦ Row-based/ Correlated Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately and loads it to all three targets. Loading continues in this way until OWB encounters an error loading row 100th to Target_2. OWB reports the error and does not load the row. It rolls back the row 100th previously inserted into Target_1 and does not attempt to load row 100 to Target_3. Then, if there are remaining rows, OWB will continue loading them, resuming with loading rows to Target_1. The mapping completes with 99 rows inserted into each target. ¦ Set-based/ Automatic Commit: Configuration of Example mapping: Result: What’s happening: When OWB encounters the error inserting into Target_2, it does not load any rows and reports an error for the table. It does, however, continue to insert rows into Target_3 and does not roll back the rows previously inserted into Target_1. The mapping completes with one error message for Target_2, no rows inserted into Target_2, and 100 rows inserted into Target_1 and Target_3 separately. ¦ Row-based/Automatic Commit: Configuration of Example mapping: Result: What’s happening: OWB evaluates each row separately for loading into the targets. Loading continues in this way until OWB encounters an error loading row 100 to Target_2 and reports the error. OWB does not roll back row 100th from Target_1, does insert it into Target_3. If there are remaining rows, it will continue to load them. The mapping completes with 99 rows inserted into Target_2 and 100 rows inserted into each of the other targets. Note: Automatic Correlated commit is not applicable for row-based (target only). If you design a mapping with the row-based (target only) and correlated commit combination, OWB runs the mapping but does not perform the correlated commit. In set-based mode, correlated commit may impact the size of your rollback segments. Space for rollback segments may be a concern when you merge data (insert/update or update/insert). Correlated commit operates transparently with PL/SQL bulk processing code. The correlated commit strategy is not available for mappings run in any mode that are configured for Partition Exchange Loading or that include a Queue, Match Merge, or Table Function operator. If you want to practice in your own environment, you can follow the steps: 1. Import the MDL file: commit_operating_mode.mdl 2. Fix the location for oracle module ORCL and deploy all tables under it. 3. Insert sample records into SOURCE table, using below plsql code: begin for i in 1..99 loop insert into source values(i, 'col_'||i); end loop; insert into source values(99, 'col_99'); end; 4. Configure MAPPING_1 to any combinations of operating mode and commit strategy you want to test. And make sure feature TLO of mapping is open. 5. Deploy Mapping “MAPPING_1”. 6. Run the mapping and check the result.

Read the article
At most how many customized P3 attributes could be added into Agile?

- by Jie Chen

I have one customer/Oracle Partner Consultant asking me such question: how many customized attributes can be allowed to add to Agile's subclass Page Three? I never did research against this because Agile User Guide never says this and theoretically Agile supports unlimited amount of customized attributes, unless the browser itself cannot handle them in allocated memory. However my customers says when to add almost 1000 attributes, the browser (Web Client) will not show any Page Three attributes, including all the out-of-box attributes. Let's see why. Analysis It is horrible to add 1000 attributes manually. Let's do it by a batch SQL like below to add them to Item's subclass Page Three tab. Do not execute below SQL because it will not take effect due to your different node id. CREATE OR REPLACE PROCEDURE createP3Text(v_name IN VARCHAR2) IS v_nid NUMBER; v_pid NUMBER; BEGIN select SEQNODETABLE.nextval into v_nid from dual; Insert Into nodeTable ( id,parentID,description,objType,inherit,helpID,version,name ) values ( v_nid,2473003, v_name ,1,0,0,0, v_name); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,1,0,1,925, null); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,0,0,0,0,1,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,0,0,0,0,2,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,2,2,0,1,3,'50'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,1,0,1,5, null); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,2,0,1,6,'50'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,2,0,0,7,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,1,8,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,1,9,'1'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,2,1,0,1,10,v_name); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,0,0,0,0,11,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,11743,1,14,'2'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,1,0,1,30, null); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,2,1,0,1,38, null); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,451,0,59,'1'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,451,0,60,'1'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,724,0,61, null); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,2,1,0,0,232,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,451,0,233,'1'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,12239,1,415,'13307'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,2,1,0,0,605,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,1,610,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,1,4,1,451,0,716,'1'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,1,795,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,2000008821,1,864,'2'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,1,923,'0'); Insert Into propertyTable ( ID,parentID,readOnly,attType,dataType,selection,visible,propertyID,value ) values ( SEQPROPERTYTABLE.nextval,v_nid,0,4,1,451,0,719,'0'); Insert Into tableInfo ( tabID,tableID,classID,att,ordering ) values ( 2473005,1501,2473002,v_nid,9999); commit; END createP3Text; / BEGIN FOR i in 1..1000 LOOP createP3Text('MyText' || i); END LOOP; END; / DROP PROCEDURE createP3Text; COMMIT; Now restart Agile Server and check the Server's log, we noticed below: ***** Node Created : 85625 ***** Property Created : 184579 +++++++++++++++++++++++++++++++++++++ + Agile PLM Server Starting Up... + +++++++++++++++++++++++++++++++++++++ However the previously log before batch SQL is ***** Node Created : 84625 ***** Property Created : 157579 +++++++++++++++++++++++++++++++++++++ + Agile PLM Server Starting Up... + +++++++++++++++++++++++++++++++++++++ Obviously we successfully imported 1000 (85625-84625) attributes. Now go to JavaClient and confirm if we have them or not. Theoretically we are able to open such item object and see all these 1000 attributes and their values, but we get below error. We have no error tips in server log. But never mind we have the Java Console for JavaClient. If to open the same item in JavaClient we get a clear error and detailed trace in Java Console. ORA-01795: maximum number of expressions in a list is 1000 java.sql.SQLException: ORA-01795: maximum number of expressions in a list is 1000 at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:125) ... ... at weblogic.jdbc.wrapper.PreparedStatement.executeQuery(PreparedStatement.java:128) at com.agile.pc.cmserver.base.AgileFlexUtil.setFlexValuesForOneRowTable(AgileFlexUtil.java:1104) at com.agile.pc.cmserver.base.BaseFlexTableDAO.loadExtraFlexAttValues(BaseFlexTableDAO.java:111) at com.agile.pc.cmserver.base.BasePageThreeDAO.loadTable(BasePageThreeDAO.java:108) If you are interested in the background of the problem, you may de-compile the class com.agile.pc.cmserver.base.AgileFlexUtil.setFlexValuesForOneRowTable and find the root cause that Agile happens to hit Oracle Database's limitation that more than 1000 values in the "IN" clause. Check here http://ora-01795.ora-code.com If you need Oracle Agile's final solution, please contact Oracle Agile Support. Performance Below two screenshot are jvm heap usage from before-SQL and after-SQL. We can see there is no big memory gap between two cases. So definitely there is no performance impact to Agile Application Server unless you have more than 1000 attributes for EACH of your dozens of subclasses. And for client, 1000 attributes should not impact the browser's performance because in HTML we only use dt and dd for each attribute's pair: label and value. It is quite lightweight.

Read the article
Andengine put bullet to pull, when it leaves screen

- by Ashot

i'm creating a bullet with physics body. Bullet class (extends Sprite class) has die() method, which unregister physics connector, hide sprite and put it in pull public void die() { Log.d("bulletDie", "See you in hell!"); if (this.isVisible()) { this.setVisible(false); mPhysicsWorld.unregisterPhysicsConnector(physicsConnector); physicsConnector.setUpdatePosition(false); body.setActive(false); this.setIgnoreUpdate(true); bulletsPool.recyclePoolItem(this); } } in onUpdate method of PhysicsConnector i executes die method, when sprite leaves screen physicsConnector = new PhysicsConnector(this,body,true,false) { @Override public void onUpdate(final float pSecondsElapsed) { super.onUpdate(pSecondsElapsed); if (!camera.isRectangularShapeVisible(_bullet)) { Log.d("bulletDie","Dead?"); _bullet.die(); } } }; it works as i expected, but _bullet.die() executes TWICE. what i`m doing wrong and is it right way to hide sprites? here is full code of Bullet class (it is inner class of class that represents player) private class Bullet extends Sprite implements PhysicsConstants { private final Body body; private final PhysicsConnector physicsConnector; private final Bullet _bullet; private int id; public Bullet(float x, float y, ITextureRegion texture, VertexBufferObjectManager vertexBufferObjectManager) { super(x,y,texture,vertexBufferObjectManager); _bullet = this; id = bulletId++; body = PhysicsFactory.createCircleBody(mPhysicsWorld, this, BodyDef.BodyType.DynamicBody, bulletFixture); physicsConnector = new PhysicsConnector(this,body,true,false) { @Override public void onUpdate(final float pSecondsElapsed) { super.onUpdate(pSecondsElapsed); if (!camera.isRectangularShapeVisible(_bullet)) { Log.d("bulletDie","Dead?"); Log.d("bulletDie",id+""); _bullet.die(); } } }; mPhysicsWorld.registerPhysicsConnector(physicsConnector); $this.getParent().attachChild(this); } public void reset() { final float angle = canon.getRotation(); final float x = (float) ((Math.cos(MathUtils.degToRad(angle))*radius) + centerX) / PIXEL_TO_METER_RATIO_DEFAULT; final float y = (float) ((Math.sin(MathUtils.degToRad(angle))*radius) + centerY) / PIXEL_TO_METER_RATIO_DEFAULT; this.setVisible(true); this.setIgnoreUpdate(false); body.setActive(true); mPhysicsWorld.registerPhysicsConnector(physicsConnector); body.setTransform(new Vector2(x,y),0); } public Body getBody() { return body; } public void setLinearVelocity(Vector2 velocity) { body.setLinearVelocity(velocity); } public void die() { Log.d("bulletDie", "See you in hell!"); if (this.isVisible()) { this.setVisible(false); mPhysicsWorld.unregisterPhysicsConnector(physicsConnector); physicsConnector.setUpdatePosition(false); body.setActive(false); this.setIgnoreUpdate(true); bulletsPool.recyclePoolItem(this); } } }

Read the article
More Fun With Math

- by PointsToShare

More Fun with Math The runaway student – three different ways of solving one problem Here is a problem I read in a Russian site: A student is running away. He is moving at 1 mph. Pursuing him are a lion, a tiger and his math teacher. The lion is 40 miles behind and moving at 6 mph. The tiger is 28 miles behind and moving at 4 mph. His math teacher is 30 miles behind and moving at 5 mph. Who will catch him first? Analysis Obviously we have a set of three problems. They are all basically the same, but the details are different. The problems are of the same class. Here is a little excursion into computer science. One of the things we strive to do is to create solutions for classes of problems rather than individual problems. In your daily routine, you call it re-usability. Not all classes of problems have such solutions. If a class has a general (re-usable) solution, it is called computable. Otherwise it is unsolvable. Within unsolvable classes, we may still solve individual (some but not all) problems, albeit with different approaches to each. Luckily the vast majority of our daily problems are computable, and the 3 problems of our runaway student belong to a computable class. So, let’s solve for the catch-up time by the math teacher, after all she is the most frightening. She might even make the poor runaway solve this very problem – perish the thought! Method 1 – numerical analysis. At 30 miles and 5 mph, it’ll take her 6 hours to come to where the student was to begin with. But by then the student has advanced by 6 miles. 6 miles require 6/5 hours, but by then the student advanced by another 6/5 of a mile as well. And so on and so forth. So what are we to do? One way is to write code and iterate it until we have solved it. But this is an infinite process so we’ll end up with an infinite loop. So what to do? We’ll use the principles of numerical analysis. Any calculator – your computer included – has a limited number of digits. A double floating point number is good for about 14 digits. Nothing can be computed at a greater accuracy than that. This means that we will not iterate ad infinidum, but rather to the point where 2 consecutive iterations yield the same result. When we do financial computations, we don’t even have to go that far. We stop at the 10th of a penny. It behooves us here to stop at a 10th of a second (100 milliseconds) and this will how we will avoid an infinite loop. Interestingly this alludes to the Zeno paradoxes of motion – in particular “Achilles and the Tortoise”. Zeno says exactly the same. To catch the tortoise, Achilles must always first come to where the tortoise was, but the tortoise keeps moving – hence Achilles will never catch the tortoise and our math teacher (or lion, or tiger) will never catch the student, or the policeman the thief. Here is my resolution to the paradox. The distance and time in each step are smaller and smaller, so the student will be caught. The only thing that is infinite is the iterative solution. The race is a convergent geometric process so the steps are diminishing, but each step in the solution takes the same amount of effort and time so with an infinite number of steps, we’ll spend an eternity solving it. This BTW is an original thought that I have never seen before. But I digress. Let’s simply write the code to solve the problem. To make sure that it runs everywhere, I’ll do it in JavaScript. function LongCatchUpTime(D, PV, FV) // D is Distance; PV is Pursuers Velocity; FV is Fugitive’ Velocity { var t = 0; var T = 0; var d = parseFloat(D); var pv = parseFloat (PV); var fv = parseFloat (FV); t = d / pv; while (t > 0.000001) //a 10th of a second is 1/36,000 of an hour, I used 1/100,000 { T = T + t; d = t * fv; t = d / pv; } return T; } By and large, the higher the Pursuer’s velocity relative to the fugitive, the faster the calculation. Solving this with the 10th of a second limit yields: 7.499999232000001 Method 2 – Geometric Series. Each step in the iteration above is smaller than the next. As you saw, we stopped iterating when the last step was small enough, small enough not to really matter. When we have a sequence of numbers in which the ratio of each number to its predecessor is fixed we call the sequence geometric. When we are looking at the sum of sequence, we call the sequence of sums series. Now let’s look at our student and teacher. The teacher runs 5 times faster than the student, so with each iteration the distance between them shrinks to a fifth of what it was before. This is a fixed ratio so we deal with a geometric series. We normally designate this ratio as q and when q is less than 1 (0 < q < 1) the sum of + … + is – 1) / (q – 1). When q is less than 1, it is easier to use ) / (1 - q). Now, the steps are 6 hours then 6/5 hours then 6/5*5 and so on, so q = 1/5. And the whole series is multiplied by 6. Also because q is less than 1 , 1/ diminishes to 0. So the sum is just / (1 - q). or 1/ (1 – 1/5) = 1 / (4/5) = 5/4. This times 6 yields 7.5 hours. We can now continue with some algebra and take it back to a simpler formula. This is arduous and I am not going to do it here. Instead let’s do some simpler algebra. Method 3 – Simple Algebra. If the time to capture the fugitive is T and the fugitive travels at 1 mph, then by the time the pursuer catches him he travelled additional T miles. Time is distance divided by speed, so…. (D + T)/V = T thus D + T = VT and D = VT – T = (V – 1)T and T = D/(V – 1) This “strangely” coincides with the solution we just got from the geometric sequence. This is simpler ad faster. Here is the corresponding code. function ShortCatchUpTime(D, PV, FV) { var d = parseFloat(D); var pv = parseFloat (PV); var fv = parseFloat (FV); return d / (pv - fv); } The code above, for both the iterative solution and the algebraic solution are actually for a larger class of problems. In our original problem the student’s velocity (speed) is 1 mph. In the code it may be anything as long as it is less than the pursuer’s velocity. As long as PV > FV, the pursuer will catch up. Here is the really general formula: T = D / (PV – FV) Finally, let’s run the program for each of the pursuers. It could not be worse. I know he’d rather be eaten alive than suffering through yet another math lesson. See the code run? Select “Catch Up Time” in www.mgsltns.com/games.htm The host is running on Unix, so the link is case sensitive. That’s All Folks

Read the article
Atmospheric scattering sky from space artifacts

- by ollipekka

I am in the process of implementing atmospheric scattering of a planets from space. I have been using Sean O'Neil's shaders from http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter16.html as a starting point. I have pretty much the same problem related to fCameraAngle except with SkyFromSpace shader as opposed to GroundFromSpace shader as here: http://www.gamedev.net/topic/621187-sean-oneils-atmospheric-scattering/ I get strange artifacts with sky from space shader when not using fCameraAngle = 1 in the inner loop. What is the cause of these artifacts? The artifacts disappear when fCameraAngle is limtied to 1. I also seem to lack the hue that is present in O'Neil's sandbox (http://sponeil.net/downloads.htm) Camera position X=0, Y=0, Z=500. GroundFromSpace on the left, SkyFromSpace on the right. Camera position X=500, Y=500, Z=500. GroundFromSpace on the left, SkyFromSpace on the right. I've found that the camera angle seems to handled very differently depending the source: In the original shaders the camera angle in SkyFromSpaceShader is calculated as: float fCameraAngle = dot(v3Ray, v3SamplePoint) / fHeight; Whereas in ground from space shader the camera angle is calculated as: float fCameraAngle = dot(-v3Ray, v3Pos) / length(v3Pos); However, various sources online tinker with negating the ray. Why is this? Here is a C# Windows.Forms project that demonstrates the problem and that I've used to generate the images: https://github.com/ollipekka/AtmosphericScatteringTest/ Update: I have found out from the ScatterCPU project found on O'Neil's site that the camera ray is negated when the camera is above the point being shaded so that the scattering is calculated from point to the camera. Changing the ray direction indeed does remove artifacts, but introduces other problems as illustrated here: Furthermore, in the ScatterCPU project, O'Neil guards against situations where optical depth for light is less than zero: float fLightDepth = Scale(fLightAngle, fScaleDepth); if (fLightDepth < float.Epsilon) { continue; } As pointed out in the comments, along with these new artifacts this still leaves the question, what is wrong with the images where camera is positioned at 500, 500, 500? It feels like the halo is focused on completely wrong part of the planet. One would expect that the light would be closer to the spot where the sun should hits the planet, rather than where it changes from day to night. The github project has been updated to reflect changes in this update.

Read the article
More Fun with C# Iterators and Generators

- by James Michael Hare

In my last post, I talked quite a bit about iterators and how they can be really powerful tools for filtering a list of items down to a subset of items. This had both pros and cons over returning a full collection, which, in summary, were: Pros: If traversal is only partial, does not have to visit rest of collection. If evaluation method is costly, only incurs that cost on elements visited. Adds little to no garbage collection pressure. Cons: Very slight performance impact if you know caller will always consume all items in collection. And as we saw in the last post, that con for the cost was very, very small and only really became evident on very tight loops consuming very large lists completely. One of the key items to note, though, is the garbage! In the traditional (return a new collection) method, if you have a 1,000,000 element collection, and wish to transform or filter it in some way, you have to allocate space for that copy of the collection. That is, say you have a collection of 1,000,000 items and you want to double every item in the collection. Well, that means you have to allocate a collection to hold those 1,000,000 items to return, which is a lot especially if you are just going to use it once and toss it. Iterators, though, don't have this problem. Each time you visit the node, it would return the doubled value of the node (in this example) and not allocate a second collection of 1,000,000 doubled items. Do you see the distinction? In both cases, we're consuming 1,000,000 items. But in one case we pass back each doubled item which is just an int (for example's sake) on the stack and in the other case, we allocate a list containing 1,000,000 items which then must be garbage collected. So iterators in C# are pretty cool, eh? Well, here's one more thing a C# iterator can do that a traditional "return a new collection" transformation can't! It can return **unbounded** collections! I know, I know, that smells a lot like an infinite loop, eh? Yes and no. Basically, you're relying on the caller to put the bounds on the list, and as long as the caller doesn't you keep going. Consider this example: public static class Fibonacci { // returns the infinite fibonacci sequence public static IEnumerable<int> Sequence() { int iteration = 0; int first = 1; int second = 1; int current = 0; while (true) { if (iteration++ < 2) { current = 1; } else { current = first + second; second = first; first = current; } yield return current; } } } Whoa, you say! Yes, that's an infinite loop! What the heck is going on there? Yes, that was intentional. Would it be better to have a fibonacci sequence that returns only a specific number of items? Perhaps, but that wouldn't give you the power to defer the execution to the caller. The beauty of this function is it is as infinite as the sequence itself! The fibonacci sequence is unbounded, and so is this method. It will continue to return fibonacci numbers for as long as you ask for them. Now that's not something you can do with a traditional method that would return a collection of ints representing each number. In that case you would eventually run out of memory as you got to higher and higher numbers. This method, though, never runs out of memory. Now, that said, you do have to know when you use it that it is an infinite collection and bound it appropriately. Fortunately, Linq provides a lot of these extension methods for you! Let's say you only want the first 10 fibonacci numbers: foreach(var fib in Fibonacci.Sequence().Take(10)) { Console.WriteLine(fib); } Or let's say you only want the fibonacci numbers that are less than 100: foreach(var fib in Fibonacci.Sequence().TakeWhile(f => f < 100)) { Console.WriteLine(fib); } So, you see, one of the nice things about iterators is their power to work with virtually any size (even infinite) collections without adding the garbage collection overhead of making new collections. You can also do fun things like this to make a more "fluent" interface for for loops: // A set of integer generator extension methods public static class IntExtensions { // Begins counting to inifity, use To() to range this. public static IEnumerable<int> Every(this int start) { // deliberately avoiding condition because keeps going // to infinity for as long as values are pulled. for (var i = start; ; ++i) { yield return i; } } // Begins counting to infinity by the given step value, use To() to public static IEnumerable<int> Every(this int start, int byEvery) { // deliberately avoiding condition because keeps going // to infinity for as long as values are pulled. for (var i = start; ; i += byEvery) { yield return i; } } // Begins counting to inifity, use To() to range this. public static IEnumerable<int> To(this int start, int end) { for (var i = start; i <= end; ++i) { yield return i; } } // Ranges the count by specifying the upper range of the count. public static IEnumerable<int> To(this IEnumerable<int> collection, int end) { return collection.TakeWhile(item => item <= end); } } Note that there are two versions of each method. One that starts with an int and one that starts with an IEnumerable<int>. This is to allow more power in chaining from either an existing collection or from an int. This lets you do things like: // count from 1 to 30 foreach(var i in 1.To(30)) { Console.WriteLine(i); } // count from 1 to 10 by 2s foreach(var i in 0.Every(2).To(10)) { Console.WriteLine(i); } // or, if you want an infinite sequence counting by 5s until something inside breaks you out... foreach(var i in 0.Every(5)) { if (someCondition) { break; } ... } Yes, those are kinda play functions and not particularly useful, but they show some of the power of generators and extension methods to form a fluid interface. So what do you think? What are some of your favorite generators and iterators?

Read the article
Full-text Indexing Books Online

- by Most Valuable Yak (Rob Volk)

While preparing for a recent SQL Saturday presentation, I was struck by a crazy idea (shocking, I know): Could someone import the content of SQL Server Books Online into a database and apply full-text indexing to it? The answer is yes, and it's really quite easy to do. The first step is finding the installed help files. If you have SQL Server 2012, BOL is installed under the Microsoft Help Library. You can find the install location by opening SQL Server Books Online and clicking the gear icon for the Help Library Manager. When the new window pops up click the Settings link, you'll get the following: You'll see the path under Library Location. Once you navigate to that path you'll have to drill down a little further, to C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store. This is where the help file content is kept if you downloaded it for offline use. Depending on which products you've downloaded help for, you may see a few hundred files. Fortunately they're named well and you can easily find the "SQL_Server_Denali_Books_Online_" files. We are interested in the .MSHC files only, and can skip the Installation and Developer Reference files. Despite the .MHSC extension, these files are compressed with the standard Zip format, so your favorite archive utility (WinZip, 7Zip, WinRar, etc.) can open them. When you do, you'll see a few thousand files in the archive. We are only interested in the .htm files, but there's no harm in extracting all of them to a folder. 7zip provides a command-line utility and the following will extract to a D:\SQLHelp folder previously created: 7z e –oD:\SQLHelp "C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store\SQL_Server_Denali_Books_Online_B780_SQL_110_en-us_1.2.mshc" *.htm Well that's great Rob, but how do I put all those files into a full-text index? I'll tell you in a second, but first we have to set up a few things on the database side. I'll be using a database named Explore (you can certainly change that) and the following setup is a fragment of the script I used in my presentation: USE Explore; GO CREATE SCHEMA help AUTHORIZATION dbo; GO -- Create default fulltext catalog for later FT indexes CREATE FULLTEXT CATALOG FTC AS DEFAULT; GO CREATE TABLE help.files(file_id int not null IDENTITY(1,1) CONSTRAINT PK_help_files PRIMARY KEY, path varchar(256) not null CONSTRAINT UNQ_help_files_path UNIQUE, doc_type varchar(6) DEFAULT('.xml'), content varbinary(max) not null); CREATE FULLTEXT INDEX ON help.files(content TYPE COLUMN doc_type LANGUAGE 1033) KEY INDEX PK_help_files; This will give you a table, default full-text catalog, and full-text index on that table for the content you're going to insert. I'll be using the command line again for this, it's the easiest method I know: for %a in (D:\SQLHelp\*.htm) do sqlcmd -S. -E -d Explore -Q"set nocount on;insert help.files(path,content) select '%a', cast(c as varbinary(max)) from openrowset(bulk '%a', SINGLE_CLOB) as c(c)" You'll need to copy and run that as one line in a command prompt. I'll explain what this does while you run it and watch several thousand files get imported: The "for" command allows you to loop over a collection of items. In this case we want all the .htm files in the D:\SQLHelp folder. For each file it finds, it will assign the full path and file name to the %a variable. In the "do" clause, we'll specify another command to be run for each iteration of the loop. I make a call to "sqlcmd" in order to run a SQL statement. I pass in the name of the server (-S.), where "." represents the local default instance. I specify -d Explore as the database, and -E for trusted connection. I then use -Q to run a query that I enclose in double quotes. The query uses OPENROWSET(BULK…SINGLE_CLOB) to open the file as a data source, and to treat it as a single character large object. In order for full-text indexing to work properly, I have to convert the text content to varbinary. I then INSERT these contents along with the full path of the file into the help.files table created earlier. This process continues for each file in the folder, creating one new row in the table. And that's it! 5 SQL Statements and 2 command line statements to unzip and import SQL Server Books Online! In case you're wondering why I didn't use FILESTREAM or FILETABLE, it's simply because I haven't learned them…yet. I may return to this blog after I figure that out and update it with the steps to do so. I believe that will make it even easier. In the spirit of exploration, I'll leave you to work on some fulltext queries of this content. I also recommend playing around with the sys.dm_fts_xxxx DMVs (I particularly like sys.dm_fts_index_keywords, it's pretty interesting). There are additional example queries in the download material for my presentation linked above. Many thanks to Kevin Boles (t) for his advice on (re)checking the content of the help files. Don't let that .htm extension fool you! The 2012 help files are actually XML, and you'd need to specify '.xml' in your document type column in order to extract the full-text keywords. (You probably noticed this in the default definition for the doc_type column.) You can query sys.fulltext_document_types to get a complete list of the types that can be full-text indexed. I also need to thank Hilary Cotter for giving me the original idea. I believe he used MSDN content in a full-text index for an article from waaaaaaaaaaay back, that I can't find now, and had forgotten about until just a few days ago. He is also co-author of Pro Full-Text Search in SQL Server 2008, which I highly recommend. He also has some FTS articles on Simple Talk: http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features/ http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features,-part-2/

Read the article

Search Results

Search found 10235 results on 410 pages for 'inner loop'.

Page 221/410 | < Previous Page | 217 218 219 220 221 222 223 224 225 226 227 228 | Next Page >

- by ggarcia24

- by Reed

- by Reed

- by Paul White

- by pinaldave

- by jamiet

- by Raven Dreamer

- by David Allan

- by [email protected]

- by EltonStoneman

- by pinaldave

- by Boonei

- by BuckWoody

- by BuckWoody

- by mehfuzh

- by Stephen.Walther

- by pinaldave

- by vik20000in

- by Kevin Yang

- by Jie Chen

- by Ashot

- by PointsToShare

- by ollipekka

- by James Michael Hare

- by Most Valuable Yak (Rob Volk)

< Previous Page | 217 218 219 220 221 222 223 224 225 226 227 228 | Next Page >