Search Results

Search found 19308 results on 773 pages for 'network efficiency'.

Page 703/773 | < Previous Page | 699 700 701 702 703 704 705 706 707 708 709 710 | Next Page >

Looking for a nice and handy bandwidth limiter

- by Spirit

Here's the thing. My ISP are f*gs. Most of the time the network is realy good, but then again there are times when it sucks. They say we have 20mbp/s. Me and my brother are usually playing dota and it's fine. One of us plays the other's watching streaming or both of us are playing or something like that. But sometimes like one of these days one is lagging even if the other is watching youtube at 360p quality. So I do know that there are many bandwidth limiters out there. But what could you recommend given my situation. I like both of us to be able to install it and it would be good idea to have something like ON/OFF switch. When he is not here I would like to turn that thing off. But when one of us plays I would like to turn it ON and whatever I do, to be limited to 10 or 8mbps. That way we will not interrupt each other. Thanks guys :)

Read the article
I'm having trouble getting my server to appear online.

- by JMRboosties

Total newb question I'm sure. First I had installed WAMP (http://www.wampserver.com), and I was able to access my pages from other computers in my router network, and the virtual device used to debug Android programs (the purpose of my having a server). This functionality failed, however, at some point over these past few days. While my own browser displays the pages just fine, other computers, my Android phone (on our room's wifi), and my virtual device are no longer able to connect to my pages. I had not made any changes in the settings. I uninstalled WAMP and installed EasyPHP. However, the problem was not resolved. I know this is rather vague, but does anyone here have an idea of what may have happened? I forwarded both port 80 (I know its the default HTTP port, I did it just to be safe), and now port 8888 which EasyPHP uses. I turned my firewall on my hosting computer off for good measure. I cannot access my pages from neither remote computers or computers using my router. Any ideas you may have on how to resolve this would be awesome, thanks a lot. And if you need anymore info please tell me.

Read the article
100% CPU use when new usb device plugged in - services.exe / Windows Server 2003

- by Will3265

On my server I am trying to install a new usb drive but all that happens is that the system starts using huge amounts of processor cycles with services.exe. On closer inspection with process explorer there is a thread called umpnpmgr.dll using most of the services.exe processor time. I left it for a half hour and still nothing happened. Rebooted and tried again, same result. Tried a different usb drive, then a flash drive but still same issue. Tried updating driver but it said the update function was already in action. I have used process explorer to kill the thread now so the server can still perform its intended functions. Any device that was previously installed before this began happening will still work but any device new to the system will not. My question(s) is/are: Is there a way to manually install the device into the registry so Windows thinks it is a previously installed device? Or can this problem be repaired through anything other than a reinstall? To do a reinstall would mean backing up large amounts of data which is hard with a usb drive and insufficient space on all other network machines. Any help would be greatly appreciated. William

Read the article
Can we add new attribute or change type of existing attribute to a "Referenced Element"?

- by JSteve

In my XML schema I have an element being referenced tens of times by other elements but with different enumerated values for one of its attribute. For now, instead of creating this element in global space and referencing it later, I am creating a new instance wherever it is needed. This approach has increased my schema size enormously because of repeated creation of almost same element many times. It also may have adverse effect on efficiency of the schema. The only way that I see is to create element once and then reference it many times but my problem is: one of the attribute of this referenced element is required to have a different set of enumerations for each referencing element. My question is: Is it possible to to add an attribute to a "Referenced Element" in XML Schema? Something like this: <?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.myDomain.com" xmlns="http://www.myDomain.com" elementFormDefault="qualified"> <xs:simpleType name="myValues1"> <xs:restriction base="xs:string"> <xs:enumeration value="value1" /> <xs:enumeration value="value2" /> </xs:restriction> </xs:simpleType> <xs:element name="myElement"> <xs:complexType mixed="true"> <xs:attribute name="attr1" type="xs:string" /> <xs:attribute name="attr2" type="xs:string" /> </xs:complexType> </xs:element> <xs:element name="MainElement1"> <xs:complexType> <xs:sequence> <xs:element ref="myElement"> <xs:complexType> <xs:attribute name="myAtt" type="myValues1" /> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute name="mainAtt1" /> </xs:complexType> </xs:element> </xs:schema> Or can we change type of an existing attribute of a "Referenced Element" in XML Schema? something like this: <?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.myDomain.com" xmlns="http://www.myDomain.com" elementFormDefault="qualified"> <xs:simpleType name="myValues1"> <xs:restriction base="xs:string"> <xs:enumeration value="value1" /> <xs:enumeration value="value2" /> </xs:restriction> </xs:simpleType> <xs:simpleType name="myValues2"> <xs:restriction base="xs:string"> <xs:enumeration value="value3" /> <xs:enumeration value="value4" /> </xs:restriction> </xs:simpleType> <xs:element name="myElement"> <xs:complexType mixed="true"> <xs:attribute name="attr1" type="xs:string" /> <xs:attribute name="attr2" type="xs:string" /> <xs:attribute name="myAtt" type="myValues1" /> </xs:complexType> </xs:element> <xs:element name="MainElement1"> <xs:complexType> <xs:sequence> <xs:element ref="myElement"> <xs:complexType> <xs:attribute name="myAtt" type="myValues2" /> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute name="mainAtt1" /> </xs:complexType> </xs:element> </xs:schema>

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
SQL Server 2012 - AlwaysOn

- by Claus Jandausch

Ich war nicht nur irritiert, ich war sogar regelrecht schockiert - und für einen kurzen Moment sprachlos (was nur selten der Fall ist). Gerade eben hatte mich jemand gefragt "Wann Oracle denn etwas Vergleichbares wie AlwaysOn bieten würde - und ob überhaupt?" War ich hier im falschen Film gelandet? Ich konnte nicht anders, als meinen Unmut kundzutun und zu erklären, dass die Fragestellung normalerweise anders herum läuft. Zugegeben - es mag vielleicht strittige Punkte geben im Vergleich zwischen Oracle und SQL Server - bei denen nicht unbedingt immer Oracle die Nase vorn haben muss - aber das Thema Clustering für Hochverfügbarkeit (HA), Disaster Recovery (DR) und Skalierbarkeit gehört mit Sicherheit nicht dazu. Dieses Erlebnis hakte ich am Nachgang als Einzelfall ab, der so nie wieder vorkommen würde. Bis ich kurz darauf eines Besseren belehrt wurde und genau die selbe Frage erneut zu hören bekam. Diesmal sogar im Exadata-Umfeld und einem Oracle Stretch Cluster. Einmal ist keinmal, doch zweimal ist einmal zu viel... Getreu diesem alten Motto war mir klar, dass man das so nicht länger stehen lassen konnte. Ich habe keine Ahnung, wie die Microsoft Marketing Abteilung es geschafft hat, unter dem AlwaysOn Brading eine innovative Technologie vermuten zu lassen - aber sie hat ihren Job scheinbar gut gemacht. Doch abgesehen von einem guten Marketing, stellt sich natürlich die Frage, was wirklich dahinter steckt und wie sich das Ganze mit Oracle vergleichen lässt - und ob überhaupt? Damit wären wir wieder bei der ursprünglichen Frage angelangt. So viel zum Hintergrund dieses Blogbeitrags - von meiner Antwort handelt der restliche Blog. "Windows was the God ..." Um den wahren Unterschied zwischen Oracle und Microsoft verstehen zu können, muss man zunächst das bedeutendste Microsoft Dogma kennen. Es lässt sich schlicht und einfach auf den Punkt bringen: "Alles muss auf Windows basieren." Die Überschrift dieses Absatzes ist kein von mir erfundener Ausspruch, sondern ein Zitat. Konkret stammt es aus einem längeren Artikel von Kurt Eichenwald in der Vanity Fair aus dem August 2012. Er lautet Microsoft's Lost Decade und sei jedem ans Herz gelegt, der die "Microsoft-Maschinerie" unter Steve Ballmer und einige ihrer Kuriositäten besser verstehen möchte. "YOU TALKING TO ME?" Microsoft C.E.O. Steve Ballmer bei seiner Keynote auf der 2012 International Consumer Electronics Show in Las Vegas am 9. Januar Manche Dinge in diesem Artikel mögen überspitzt dargestellt erscheinen - sind sie aber nicht. Vieles davon kannte ich bereits aus eigener Erfahrung und kann es nur bestätigen. Anderes hat sich mir erst so richtig erschlossen. Insbesondere die folgenden Passagen führten zum Aha-Erlebnis: “Windows was the god—everything had to work with Windows,” said Stone... “Every little thing you want to write has to build off of Windows (or other existing roducts),” one software engineer said. “It can be very confusing, …” Ich habe immer schon darauf hingewiesen, dass in einem SQL Server Failover Cluster die Microsoft Datenbank eigentlich nichts Nenneswertes zum Geschehen beiträgt, sondern sich voll und ganz auf das Windows Betriebssystem verlässt. Deshalb muss man auch die Windows Server Enterprise Edition installieren, soll ein Failover Cluster für den SQL Server eingerichtet werden. Denn hier werden die Cluster Services geliefert - nicht mit dem SQL Server. Er ist nur lediglich ein weiteres Server Produkt, für das Windows in Ausfallszenarien genutzt werden kann - so wie Microsoft Exchange beispielsweise, oder Microsoft SharePoint, oder irgendein anderes Server Produkt das auf Windows gehostet wird. Auch Oracle kann damit genutzt werden. Das Stichwort lautet hier: Oracle Failsafe. Nur - warum sollte man das tun, wenn gleichzeitig eine überlegene Technologie wie die Oracle Real Application Clusters (RAC) zur Verfügung steht, die dann auch keine Windows Enterprise Edition voraussetzen, da Oracle die eigene Clusterware liefert. Welche darüber hinaus für kürzere Failover-Zeiten sorgt, da diese Cluster-Technologie Datenbank-integriert ist und sich nicht auf "Dritte" verlässt. Wenn man sich also schon keine technischen Vorteile mit einem SQL Server Failover Cluster erkauft, sondern zusätzlich noch versteckte Lizenzkosten durch die Lizenzierung der Windows Server Enterprise Edition einhandelt, warum hat Microsoft dann in den vergangenen Jahren seit SQL Server 2000 nicht ebenfalls an einer neuen und innovativen Lösung gearbeitet, die mit Oracle RAC mithalten kann? Entwickler hat Microsoft genügend? Am Geld kann es auch nicht liegen? Lesen Sie einfach noch einmal die beiden obenstehenden Zitate und sie werden den Grund verstehen. Anders lässt es sich ja auch gar nicht mehr erklären, dass AlwaysOn aus zwei unterschiedlichen Technologien besteht, die beide jedoch wiederum auf dem Windows Server Failover Clustering (WSFC) basieren. Denn daraus ergeben sich klare Nachteile - aber dazu später mehr. Um AlwaysOn zu verstehen, sollte man sich zunächst kurz in Erinnerung rufen, was Microsoft bisher an HA/DR (High Availability/Desaster Recovery) Lösungen für SQL Server zur Verfügung gestellt hat. Replikation Basiert auf logischer Replikation und Pubisher/Subscriber Architektur Transactional Replication Merge Replication Snapshot Replication Microsoft's Replikation ist vergleichbar mit Oracle GoldenGate. Oracle GoldenGate stellt jedoch die umfassendere Technologie dar und bietet High Performance. Log Shipping Microsoft's Log Shipping stellt eine einfache Technologie dar, die vergleichbar ist mit Oracle Managed Recovery in Oracle Version 7. Das Log Shipping besitzt folgende Merkmale: Transaction Log Backups werden von Primary nach Secondary/ies geschickt Einarbeitung (z.B. Restore) auf jedem Secondary individuell Optionale dritte Server Instanz (Monitor Server) für Überwachung und Alarm Log Restore Unterbrechung möglich für Read-Only Modus (Secondary) Keine Unterstützung von Automatic Failover Database Mirroring Microsoft's Database Mirroring wurde verfügbar mit SQL Server 2005, sah aus wie Oracle Data Guard in Oracle 9i, war funktional jedoch nicht so umfassend. Für ein HA/DR Paar besteht eine 1:1 Beziehung, um die produktive Datenbank (Principle DB) abzusichern. Auf der Standby Datenbank (Mirrored DB) werden alle Insert-, Update- und Delete-Operationen nachgezogen. Modi Synchron (High-Safety Modus) Asynchron (High-Performance Modus) Automatic Failover Unterstützt im High-Safety Modus (synchron) Witness Server vorausgesetzt Zur Frage der Kontinuität Es stellt sich die Frage, wie es um diesen Technologien nun im Zusammenhang mit SQL Server 2012 bestellt ist. Unter Fanfaren seinerzeit eingeführt, war Database Mirroring das erklärte Mittel der Wahl. Ich bin kein Produkt Manager bei Microsoft und kann hierzu nur meine Meinung äußern, aber zieht man den SQL AlwaysOn Team Blog heran, so sieht es nicht gut aus für das Database Mirroring - zumindest nicht langfristig. "Does AlwaysOn Availability Group replace Database Mirroring going forward?” “The short answer is we recommend that you migrate from the mirroring configuration or even mirroring and log shipping configuration to using Availability Group. Database Mirroring will still be available in the Denali release but will be phased out over subsequent releases. Log Shipping will continue to be available in future releases.” Damit wären wir endlich beim eigentlichen Thema angelangt. Was ist eine sogenannte Availability Group und was genau hat es mit der vielversprechend klingenden Bezeichnung AlwaysOn auf sich? SQL Server 2012 - AlwaysOn Zwei HA-Features verstekcne sich hinter dem “AlwaysOn”-Branding. Einmal das AlwaysOn Failover Clustering aka SQL Server Failover Cluster Instances (FCI) - zum Anderen die AlwaysOn Availability Groups. Failover Cluster Instances (FCI) Entspricht ungefähr dem Stretch Cluster Konzept von Oracle Setzt auf Windows Server Failover Clustering (WSFC) auf Bietet HA auf Instanz-Ebene AlwaysOn Availability Groups (Verfügbarkeitsgruppen) Ähnlich der Idee von Consistency Groups, wie in Storage-Level Replikations-Software von z.B. EMC SRDF Abhängigkeiten zu Windows Server Failover Clustering (WSFC) Bietet HA auf Datenbank-Ebene Hinweis: Verwechseln Sie nicht eine SQL Server Datenbank mit einer Oracle Datenbank. Und auch nicht eine Oracle Instanz mit einer SQL Server Instanz. Die gleichen Begriffe haben hier eine andere Bedeutung - nicht selten ein Grund, weshalb Oracle- und Microsoft DBAs schnell aneinander vorbei reden. Denken Sie bei einer SQL Server Datenbank eher an ein Oracle Schema, das kommt der Sache näher. So etwas wie die SQL Server Northwind Datenbank ist vergleichbar mit dem Oracle Scott Schema. Wenn Sie die genauen Unterschiede kennen möchten, finden Sie eine detaillierte Beschreibung in meinem Buch "Oracle10g Release 2 für Windows und .NET", erhältich bei Lehmanns, Amazon, etc. Windows Server Failover Clustering (WSFC) Wie man sieht, basieren beide AlwaysOn Technologien wiederum auf dem Windows Server Failover Clustering (WSFC), um einerseits Hochverfügbarkeit auf Ebene der Instanz zu gewährleisten und andererseits auf der Datenbank-Ebene. Deshalb nun eine kurze Beschreibung der WSFC. Die WSFC sind ein mit dem Windows Betriebssystem geliefertes Infrastruktur-Feature, um HA für Server Anwendungen, wie Microsoft Exchange, SharePoint, SQL Server, etc. zu bieten. So wie jeder andere Cluster, besteht ein WSFC Cluster aus einer Gruppe unabhängiger Server, die zusammenarbeiten, um die Verfügbarkeit einer Applikation oder eines Service zu erhöhen. Falls ein Cluster-Knoten oder -Service ausfällt, kann der auf diesem Knoten bisher gehostete Service automatisch oder manuell auf einen anderen im Cluster verfügbaren Knoten transferriert werden - was allgemein als Failover bekannt ist. Unter SQL Server 2012 verwenden sowohl die AlwaysOn Avalability Groups, als auch die AlwaysOn Failover Cluster Instances die WSFC als Plattformtechnologie, um Komponenten als WSFC Cluster-Ressourcen zu registrieren. Verwandte Ressourcen werden in eine Ressource Group zusammengefasst, die in Abhängigkeit zu anderen WSFC Cluster-Ressourcen gebracht werden kann. Der WSFC Cluster Service kann jetzt die Notwendigkeit zum Neustart der SQL Server Instanz erfassen oder einen automatischen Failover zu einem anderen Server-Knoten im WSFC Cluster auslösen. Failover Cluster Instances (FCI) Eine SQL Server Failover Cluster Instanz (FCI) ist eine einzelne SQL Server Instanz, die in einem Failover Cluster betrieben wird, der aus mehreren Windows Server Failover Clustering (WSFC) Knoten besteht und so HA (High Availability) auf Ebene der Instanz bietet. Unter Verwendung von Multi-Subnet FCI kann auch Remote DR (Disaster Recovery) unterstützt werden. Eine weitere Option für Remote DR besteht darin, eine unter FCI gehostete Datenbank in einer Availability Group zu betreiben. Hierzu später mehr. FCI und WSFC Basis FCI, das für lokale Hochverfügbarkeit der Instanzen genutzt wird, ähnelt der veralteten Architektur eines kalten Cluster (Aktiv-Passiv). Unter SQL Server 2008 wurde diese Technologie SQL Server 2008 Failover Clustering genannt. Sie nutzte den Windows Server Failover Cluster. In SQL Server 2012 hat Microsoft diese Basistechnologie unter der Bezeichnung AlwaysOn zusammengefasst. Es handelt sich aber nach wie vor um die klassische Aktiv-Passiv-Konfiguration. Der Ablauf im Failover-Fall ist wie folgt: Solange kein Hardware-oder System-Fehler auftritt, werden alle Dirty Pages im Buffer Cache auf Platte geschrieben Alle entsprechenden SQL Server Services (Dienste) in der Ressource Gruppe werden auf dem aktiven Knoten gestoppt Die Ownership der Ressource Gruppe wird auf einen anderen Knoten der FCI transferriert Der neue Owner (Besitzer) der Ressource Gruppe startet seine SQL Server Services (Dienste) Die Connection-Anforderungen einer Client-Applikation werden automatisch auf den neuen aktiven Knoten mit dem selben Virtuellen Network Namen (VNN) umgeleitet Abhängig vom Zeitpunkt des letzten Checkpoints, kann die Anzahl der Dirty Pages im Buffer Cache, die noch auf Platte geschrieben werden müssen, zu unvorhersehbar langen Failover-Zeiten führen. Um diese Anzahl zu drosseln, besitzt der SQL Server 2012 eine neue Fähigkeit, die Indirect Checkpoints genannt wird. Indirect Checkpoints ähnelt dem Fast-Start MTTR Target Feature der Oracle Datenbank, das bereits mit Oracle9i verfügbar war. SQL Server Multi-Subnet Clustering Ein SQL Server Multi-Subnet Failover Cluster entspricht vom Konzept her einem Oracle RAC Stretch Cluster. Doch dies ist nur auf den ersten Blick der Fall. Im Gegensatz zu RAC ist in einem lokalen SQL Server Failover Cluster jeweils nur ein Knoten aktiv für eine Datenbank. Für die Datenreplikation zwischen geografisch entfernten Sites verlässt sich Microsoft auf 3rd Party Lösungen für das Storage Mirroring. Die Verbesserung dieses Szenario mit einer SQL Server 2012 Implementierung besteht schlicht darin, dass eine VLAN-Konfiguration (Virtual Local Area Network) nun nicht mehr benötigt wird, so wie dies bisher der Fall war. Das folgende Diagramm stellt dar, wie der Ablauf mit SQL Server 2012 gehandhabt wird. In Site A und Site B wird HA jeweils durch einen lokalen Aktiv-Passiv-Cluster sichergestellt. Besondere Aufmerksamkeit muss hier der Konfiguration und dem Tuning geschenkt werden, da ansonsten völlig inakzeptable Failover-Zeiten resultieren. Dies liegt darin begründet, weil die Downtime auf Client-Seite nun nicht mehr nur von der reinen Failover-Zeit abhängt, sondern zusätzlich von der Dauer der DNS Replikation zwischen den DNS Servern. (Rufen Sie sich in Erinnerung, dass wir gerade von Multi-Subnet Clustering sprechen). Außerdem ist zu berücksichtigen, wie schnell die Clients die aktualisierten DNS Informationen abfragen. Spezielle Konfigurationen für Node Heartbeat, HostRecordTTL (Host Record Time-to-Live) und Intersite Replication Frequeny für Active Directory Sites und Services werden notwendig. Default TTL für Windows Server 2008 R2: 20 Minuten Empfohlene Einstellung: 1 Minute DNS Update Replication Frequency in Windows Umgebung: 180 Minuten Empfohlene Einstellung: 15 Minuten (minimaler Wert) Betrachtet man diese Werte, muss man feststellen, dass selbst eine optimale Konfiguration die rigiden SLAs (Service Level Agreements) heutiger geschäftskritischer Anwendungen für HA und DR nicht erfüllen kann. Denn dies impliziert eine auf der Client-Seite erlebte Failover-Zeit von insgesamt 16 Minuten. Hierzu ein Auszug aus der SQL Server 2012 Online Dokumentation: Cons: If a cross-subnet failover occurs, the client recovery time could be 15 minutes or longer, depending on your HostRecordTTL setting and the setting of your cross-site DNS/AD replication schedule. Wir sind hier an einem Punkt unserer Überlegungen angelangt, an dem sich erklärt, weshalb ich zuvor das "Windows was the God ..." Zitat verwendet habe. Die unbedingte Abhängigkeit zu Windows wird zunehmend zum Problem, da sie die Komplexität einer Microsoft-basierenden Lösung erhöht, anstelle sie zu reduzieren. Und Komplexität ist das Letzte, was sich CIOs heutzutage wünschen. Zur Ehrenrettung des SQL Server 2012 und AlwaysOn muss man sagen, dass derart lange Failover-Zeiten kein unbedingtes "Muss" darstellen, sondern ein "Kann". Doch auch ein "Kann" kann im unpassenden Moment unvorhersehbare und kostspielige Folgen haben. Die Unabsehbarkeit ist wiederum Ursache vieler an der Implementierung beteiligten Komponenten und deren Abhängigkeiten, wie beispielsweise drei Cluster-Lösungen (zwei von Microsoft, eine 3rd Party Lösung). Wie man die Sache auch dreht und wendet, kommt man an diesem Fakt also nicht vorbei - ganz unabhängig von der Dauer einer Downtime oder Failover-Zeiten. Im Gegensatz zu AlwaysOn und der hier vorgestellten Version eines Stretch-Clusters, vermeidet eine entsprechende Oracle Implementierung eine derartige Komplexität, hervorgerufen duch multiple Abhängigkeiten. Den Unterschied machen Datenbank-integrierte Mechanismen, wie Fast Application Notification (FAN) und Fast Connection Failover (FCF). Für Oracle MAA Konfigurationen (Maximum Availability Architecture) sind Inter-Site Failover-Zeiten im Bereich von Sekunden keine Seltenheit. Wenn Sie dem Link zur Oracle MAA folgen, finden Sie außerdem eine Reihe an Customer Case Studies. Auch dies ist ein wichtiges Unterscheidungsmerkmal zu AlwaysOn, denn die Oracle Technologie hat sich bereits zigfach in höchst kritischen Umgebungen bewährt. Availability Groups (Verfügbarkeitsgruppen) Die sogenannten Availability Groups (Verfügbarkeitsgruppen) sind - neben FCI - der weitere Baustein von AlwaysOn. Hinweis: Bevor wir uns näher damit beschäftigen, sollten Sie sich noch einmal ins Gedächtnis rufen, dass eine SQL Server Datenbank nicht die gleiche Bedeutung besitzt, wie eine Oracle Datenbank, sondern eher einem Oracle Schema entspricht. So etwas wie die SQL Server Northwind Datenbank ist vergleichbar mit dem Oracle Scott Schema. Eine Verfügbarkeitsgruppe setzt sich zusammen aus einem Set mehrerer Benutzer-Datenbanken, die im Falle eines Failover gemeinsam als Gruppe behandelt werden. Eine Verfügbarkeitsgruppe unterstützt ein Set an primären Datenbanken (primäres Replikat) und einem bis vier Sets von entsprechenden sekundären Datenbanken (sekundäre Replikate). Es können jedoch nicht alle SQL Server Datenbanken einer AlwaysOn Verfügbarkeitsgruppe zugeordnet werden. Der SQL Server Spezialist Michael Otey zählt in seinem SQL Server Pro Artikel folgende Anforderungen auf: Verfügbarkeitsgruppen müssen mit Benutzer-Datenbanken erstellt werden. System-Datenbanken können nicht verwendet werden Die Datenbanken müssen sich im Read-Write Modus befinden. Read-Only Datenbanken werden nicht unterstützt Die Datenbanken in einer Verfügbarkeitsgruppe müssen Multiuser Datenbanken sein Sie dürfen nicht das AUTO_CLOSE Feature verwenden Sie müssen das Full Recovery Modell nutzen und es muss ein vollständiges Backup vorhanden sein Eine gegebene Datenbank kann sich nur in einer einzigen Verfügbarkeitsgruppe befinden und diese Datenbank düerfen nicht für Database Mirroring konfiguriert sein Microsoft empfiehl außerdem, dass der Verzeichnispfad einer Datenbank auf dem primären und sekundären Server identisch sein sollte Wie man sieht, eignen sich Verfügbarkeitsgruppen nicht, um HA und DR vollständig abzubilden. Die Unterscheidung zwischen der Instanzen-Ebene (FCI) und Datenbank-Ebene (Availability Groups) ist von hoher Bedeutung. Vor kurzem wurde mir gesagt, dass man mit den Verfügbarkeitsgruppen auf Shared Storage verzichten könne und dadurch Kosten spart. So weit so gut ... Man kann natürlich eine Installation rein mit Verfügbarkeitsgruppen und ohne FCI durchführen - aber man sollte sich dann darüber bewusst sein, was man dadurch alles nicht abgesichert hat - und dies wiederum für Desaster Recovery (DR) und SLAs (Service Level Agreements) bedeutet. Kurzum, um die Kombination aus beiden AlwaysOn Produkten und der damit verbundene Komplexität kommt man wohl in der Praxis nicht herum. Availability Groups und WSFC AlwaysOn hängt von Windows Server Failover Clustering (WSFC) ab, um die aktuellen Rollen der Verfügbarkeitsreplikate einer Verfügbarkeitsgruppe zu überwachen und zu verwalten, und darüber zu entscheiden, wie ein Failover-Ereignis die Verfügbarkeitsreplikate betrifft. Das folgende Diagramm zeigt de Beziehung zwischen Verfügbarkeitsgruppen und WSFC: Der Verfügbarkeitsmodus ist eine Eigenschaft jedes Verfügbarkeitsreplikats. Synychron und Asynchron können also gemischt werden: Availability Modus (Verfügbarkeitsmodus) Asynchroner Commit-Modus Primäres replikat schließt Transaktionen ohne Warten auf Sekundäres Synchroner Commit-Modus Primäres Replikat wartet auf Commit von sekundärem Replikat Failover Typen Automatic Manual Forced (mit möglichem Datenverlust) Synchroner Commit-Modus Geplanter, manueller Failover ohne Datenverlust Automatischer Failover ohne Datenverlust Asynchroner Commit-Modus Nur Forced, manueller Failover mit möglichem Datenverlust Der SQL Server kennt keinen separaten Switchover Begriff wie in Oracle Data Guard. Für SQL Server werden alle Role Transitions als Failover bezeichnet. Tatsächlich unterstützt der SQL Server keinen Switchover für asynchrone Verbindungen. Es gibt nur die Form des Forced Failover mit möglichem Datenverlust. Eine ähnliche Fähigkeit wie der Switchover unter Oracle Data Guard ist so nicht gegeben. SQL Sever FCI mit Availability Groups (Verfügbarkeitsgruppen) Neben den Verfügbarkeitsgruppen kann eine zweite Failover-Ebene eingerichtet werden, indem SQL Server FCI (auf Shared Storage) mit WSFC implementiert wird. Ein Verfügbarkeitesreplikat kann dann auf einer Standalone Instanz gehostet werden, oder einer FCI Instanz. Zum Verständnis: Die Verfügbarkeitsgruppen selbst benötigen kein Shared Storage. Diese Kombination kann verwendet werden für lokale HA auf Ebene der Instanz und DR auf Datenbank-Ebene durch Verfügbarkeitsgruppen. Das folgende Diagramm zeigt dieses Szenario: Achtung! Hier handelt es sich nicht um ein Pendant zu Oracle RAC plus Data Guard, auch wenn das Bild diesen Eindruck vielleicht vermitteln mag - denn alle sekundären Knoten im FCI sind rein passiv. Es existiert außerdem eine weitere und ernsthafte Einschränkung: SQL Server Failover Cluster Instanzen (FCI) unterstützen nicht das automatische AlwaysOn Failover für Verfügbarkeitsgruppen. Jedes unter FCI gehostete Verfügbarkeitsreplikat kann nur für manuelles Failover konfiguriert werden. Lesbare Sekundäre Replikate Ein oder mehrere Verfügbarkeitsreplikate in einer Verfügbarkeitsgruppe können für den lesenden Zugriff konfiguriert werden, wenn sie als sekundäres Replikat laufen. Dies ähnelt Oracle Active Data Guard, jedoch gibt es Einschränkungen. Alle Abfragen gegen die sekundäre Datenbank werden automatisch auf das Snapshot Isolation Level abgebildet. Es handelt sich dabei um eine Versionierung der Rows. Microsoft versuchte hiermit die Oracle MVRC (Multi Version Read Consistency) nachzustellen. Tatsächlich muss man die SQL Server Snapshot Isolation eher mit Oracle Flashback vergleichen. Bei der Implementierung des Snapshot Isolation Levels handelt sich um ein nachträglich aufgesetztes Feature und nicht um einen inhärenten Teil des Datenbank-Kernels, wie im Falle Oracle. (Ich werde hierzu in Kürze einen weiteren Blogbeitrag verfassen, wenn ich mich mit der neuen SQL Server 2012 Core Lizenzierung beschäftige.) Für die Praxis entstehen aus der Abbildung auf das Snapshot Isolation Level ernsthafte Restriktionen, derer man sich für den Betrieb in der Praxis bereits vorab bewusst sein sollte: Sollte auf der primären Datenbank eine aktive Transaktion zu dem Zeitpunkt existieren, wenn ein lesbares sekundäres Replikat in die Verfügbarkeitsgruppe aufgenommen wird, werden die Row-Versionen auf der korrespondierenden sekundären Datenbank nicht sofort vollständig verfügbar sein. Eine aktive Transaktion auf dem primären Replikat muss zuerst abgeschlossen (Commit oder Rollback) und dieser Transaktions-Record auf dem sekundären Replikat verarbeitet werden. Bis dahin ist das Isolation Level Mapping auf der sekundären Datenbank unvollständig und Abfragen sind temporär geblockt. Microsoft sagt dazu: "This is needed to guarantee that row versions are available on the secondary replica before executing the query under snapshot isolation as all isolation levels are implicitly mapped to snapshot isolation." (SQL Storage Engine Blog: AlwaysOn: I just enabled Readable Secondary but my query is blocked?) Grundlegend bedeutet dies, dass ein aktives lesbares Replikat nicht in die Verfügbarkeitsgruppe aufgenommen werden kann, ohne das primäre Replikat vorübergehend stillzulegen. Da Leseoperationen auf das Snapshot Isolation Transaction Level abgebildet werden, kann die Bereinigung von Ghost Records auf dem primären Replikat durch Transaktionen auf einem oder mehreren sekundären Replikaten geblockt werden - z.B. durch eine lang laufende Abfrage auf dem sekundären Replikat. Diese Bereinigung wird auch blockiert, wenn die Verbindung zum sekundären Replikat abbricht oder der Datenaustausch unterbrochen wird. Auch die Log Truncation wird in diesem Zustant verhindert. Wenn dieser Zustand längere Zeit anhält, empfiehlt Microsoft das sekundäre Replikat aus der Verfügbarkeitsgruppe herauszunehmen - was ein ernsthaftes Downtime-Problem darstellt. Die Read-Only Workload auf den sekundären Replikaten kann eingehende DDL Änderungen blockieren. Obwohl die Leseoperationen aufgrund der Row-Versionierung keine Shared Locks halten, führen diese Operatioen zu Sch-S Locks (Schemastabilitätssperren). DDL-Änderungen durch Redo-Operationen können dadurch blockiert werden. Falls DDL aufgrund konkurrierender Lese-Workload blockiert wird und der Schwellenwert für 'Recovery Interval' (eine SQL Server Konfigurationsoption) überschritten wird, generiert der SQL Server das Ereignis sqlserver.lock_redo_blocked, welches Microsoft zum Kill der blockierenden Leser empfiehlt. Auf die Verfügbarkeit der Anwendung wird hierbei keinerlei Rücksicht genommen. Keine dieser Einschränkungen existiert mit Oracle Active Data Guard. Backups auf sekundären Replikaten Über die sekundären Replikate können Backups (BACKUP DATABASE via Transact-SQL) nur als copy-only Backups einer vollständigen Datenbank, Dateien und Dateigruppen erstellt werden. Das Erstellen inkrementeller Backups ist nicht unterstützt, was ein ernsthafter Rückstand ist gegenüber der Backup-Unterstützung physikalischer Standbys unter Oracle Data Guard. Hinweis: Ein möglicher Workaround via Snapshots, bleibt ein Workaround. Eine weitere Einschränkung dieses Features gegenüber Oracle Data Guard besteht darin, dass das Backup eines sekundären Replikats nicht ausgeführt werden kann, wenn es nicht mit dem primären Replikat kommunizieren kann. Darüber hinaus muss das sekundäre Replikat synchronisiert sein oder sich in der Synchronisation befinden, um das Beackup auf dem sekundären Replikat erstellen zu können. Vergleich von Microsoft AlwaysOn mit der Oracle MAA Ich komme wieder zurück auf die Eingangs erwähnte, mehrfach an mich gestellte Frage "Wann denn - und ob überhaupt - Oracle etwas Vergleichbares wie AlwaysOn bieten würde?" und meine damit verbundene (kurze) Irritation. Wenn Sie diesen Blogbeitrag bis hierher gelesen haben, dann kennen Sie jetzt meine darauf gegebene Antwort. Der eine oder andere Punkt traf dabei nicht immer auf Jeden zu, was auch nicht der tiefere Sinn und Zweck meiner Antwort war. Wenn beispielsweise kein Multi-Subnet mit im Spiel ist, sind alle diesbezüglichen Kritikpunkte zunächst obsolet. Was aber nicht bedeutet, dass sie nicht bereits morgen schon wieder zum Thema werden könnten (Sag niemals "Nie"). In manch anderes Fettnäpfchen tritt man wiederum nicht unbedingt in einer Testumgebung, sondern erst im laufenden Betrieb. Erst recht nicht dann, wenn man sich potenzieller Probleme nicht bewusst ist und keine dedizierten Tests startet. Und wer AlwaysOn erfolgreich positionieren möchte, wird auch gar kein Interesse daran haben, auf mögliche Schwachstellen und den besagten Teufel im Detail aufmerksam zu machen. Das ist keine Unterstellung - es ist nur menschlich. Außerdem ist es verständlich, dass man sich in erster Linie darauf konzentriert "was geht" und "was gut läuft", anstelle auf das "was zu Problemen führen kann" oder "nicht funktioniert". Wer will schon der Miesepeter sein? Für mich selbst gesprochen, kann ich nur sagen, dass ich lieber vorab von allen möglichen Einschränkungen wissen möchte, anstelle sie dann nach einer kurzen Zeit der heilen Welt schmerzhaft am eigenen Leib erfahren zu müssen. Ich bin davon überzeugt, dass es Ihnen nicht anders geht. Nachfolgend deshalb eine Zusammenfassung all jener Punkte, die ich im Vergleich zur Oracle MAA (Maximum Availability Architecture) als unbedingt Erwähnenswert betrachte, falls man eine Evaluierung von Microsoft AlwaysOn in Betracht zieht. 1. AlwaysOn ist eine komplexe Technologie Der SQL Server AlwaysOn Stack ist zusammengesetzt aus drei verschiedenen Technlogien: Windows Server Failover Clustering (WSFC) SQL Server Failover Cluster Instances (FCI) SQL Server Availability Groups (Verfügbarkeitsgruppen) Man kann eine derartige Lösung nicht als nahtlos bezeichnen, wofür auch die vielen von Microsoft dargestellten Einschränkungen sprechen. Während sich frühere SQL Server Versionen in Richtung eigener HA/DR Technologien entwickelten (wie Database Mirroring), empfiehlt Microsoft nun die Migration. Doch weshalb dieser Schwenk? Er führt nicht zu einem konsisten und robusten Angebot an HA/DR Technologie für geschäftskritische Umgebungen. Liegt die Antwort in meiner These begründet, nach der "Windows was the God ..." noch immer gilt und man die Nachteile der allzu engen Kopplung mit Windows nicht sehen möchte? Entscheiden Sie selbst ... 2. Failover Cluster Instanzen - Kein RAC-Pendant Die SQL Server und Windows Server Clustering Technologie basiert noch immer auf dem veralteten Aktiv-Passiv Modell und führt zu einer Verschwendung von Systemressourcen. In einer Betrachtung von lediglich zwei Knoten erschließt sich auf Anhieb noch nicht der volle Mehrwert eines Aktiv-Aktiv Clusters (wie den Real Application Clusters), wie er von Oracle bereits vor zehn Jahren entwickelt wurde. Doch kennt man die Vorzüge der Skalierbarkeit durch einfaches Hinzufügen weiterer Cluster-Knoten, die dann alle gemeinsam als ein einziges logisches System zusammenarbeiten, versteht man was hinter dem Motto "Pay-as-you-Grow" steckt. In einem Aktiv-Aktiv Cluster geht es zwar auch um Hochverfügbarkeit - und ein Failover erfolgt zudem schneller, als in einem Aktiv-Passiv Modell - aber es geht eben nicht nur darum. An dieser Stelle sei darauf hingewiesen, dass die Oracle 11g Standard Edition bereits die Nutzung von Oracle RAC bis zu vier Sockets kostenfrei beinhaltet. Möchten Sie dazu Windows nutzen, benötigen Sie keine Windows Server Enterprise Edition, da Oracle 11g die eigene Clusterware liefert. Sie kommen in den Genuss von Hochverfügbarkeit und Skalierbarkeit und können dazu die günstigere Windows Server Standard Edition nutzen. 3. SQL Server Multi-Subnet Clustering - Abhängigkeit zu 3rd Party Storage Mirroring Die SQL Server Multi-Subnet Clustering Architektur unterstützt den Aufbau eines Stretch Clusters, basiert dabei aber auf dem Aktiv-Passiv Modell. Das eigentlich Problematische ist jedoch, dass man sich zur Absicherung der Datenbank auf 3rd Party Storage Mirroring Technologie verlässt, ohne Integration zwischen dem Windows Server Failover Clustering (WSFC) und der darunterliegenden Mirroring Technologie. Wenn nun im Cluster ein Failover auf Instanzen-Ebene erfolgt, existiert keine Koordination mit einem möglichen Failover auf Ebene des Storage-Array. 4. Availability Groups (Verfügbarkeitsgruppen) - Vier, oder doch nur Zwei? Ein primäres Replikat erlaubt bis zu vier sekundäre Replikate innerhalb einer Verfügbarkeitsgruppe, jedoch nur zwei im Synchronen Commit Modus. Während dies zwar einen Vorteil gegenüber dem stringenten 1:1 Modell unter Database Mirroring darstellt, fällt der SQL Server 2012 damit immer noch weiter zurück hinter Oracle Data Guard mit bis zu 30 direkten Stanbdy Zielen - und vielen weiteren durch kaskadierende Ziele möglichen. Damit eignet sich Oracle Active Data Guard auch für die Bereitstellung einer Reader-Farm Skalierbarkeit für Internet-basierende Unternehmen. Mit AwaysOn Verfügbarkeitsgruppen ist dies nicht möglich. 5. Availability Groups (Verfügbarkeitsgruppen) - kein asynchrones Switchover Die Technologie der Verfügbarkeitsgruppen wird auch als geeignetes Mittel für administrative Aufgaben positioniert - wie Upgrades oder Wartungsarbeiten. Man muss sich jedoch einem gravierendem Defizit bewusst sein: Im asynchronen Verfügbarkeitsmodus besteht die einzige Möglichkeit für Role Transition im Forced Failover mit Datenverlust! Um den Verlust von Daten durch geplante Wartungsarbeiten zu vermeiden, muss man den synchronen Verfügbarkeitsmodus konfigurieren, was jedoch ernstzunehmende Auswirkungen auf WAN Deployments nach sich zieht. Spinnt man diesen Gedanken zu Ende, kommt man zu dem Schluss, dass die Technologie der Verfügbarkeitsgruppen für geplante Wartungsarbeiten in einem derartigen Umfeld nicht effektiv genutzt werden kann. 6. Automatisches Failover - Nicht immer möglich Sowohl die SQL Server FCI, als auch Verfügbarkeitsgruppen unterstützen automatisches Failover. Möchte man diese jedoch kombinieren, wird das Ergebnis kein automatisches Failover sein. Denn ihr Zusammentreffen im Failover-Fall führt zu Race Conditions (Wettlaufsituationen), weshalb diese Konfiguration nicht länger das automatische Failover zu einem Replikat in einer Verfügbarkeitsgruppe erlaubt. Auch hier bestätigt sich wieder die tiefere Problematik von AlwaysOn, mit einer Zusammensetzung aus unterschiedlichen Technologien und der Abhängigkeit zu Windows. 7. Problematische RTO (Recovery Time Objective) Microsoft postioniert die SQL Server Multi-Subnet Clustering Architektur als brauchbare HA/DR Architektur. Bedenkt man jedoch die Problematik im Zusammenhang mit DNS Replikation und den möglichen langen Wartezeiten auf Client-Seite von bis zu 16 Minuten, sind strenge RTO Anforderungen (Recovery Time Objectives) nicht erfüllbar. Im Gegensatz zu Oracle besitzt der SQL Server keine Datenbank-integrierten Technologien, wie Oracle Fast Application Notification (FAN) oder Oracle Fast Connection Failover (FCF). 8. Problematische RPO (Recovery Point Objective) SQL Server ermöglicht Forced Failover (erzwungenes Failover), bietet jedoch keine Möglichkeit zur automatischen Übertragung der letzten Datenbits von einem alten zu einem neuen primären Replikat, wenn der Verfügbarkeitsmodus asynchron war. Oracle Data Guard hingegen bietet diese Unterstützung durch das Flush Redo Feature. Dies sichert "Zero Data Loss" und beste RPO auch in erzwungenen Failover-Situationen. 9. Lesbare Sekundäre Replikate mit Einschränkungen Aufgrund des Snapshot Isolation Transaction Level für lesbare sekundäre Replikate, besitzen diese Einschränkungen mit Auswirkung auf die primäre Datenbank. Die Bereinigung von Ghost Records auf der primären Datenbank, wird beeinflusst von lang laufenden Abfragen auf der lesabaren sekundären Datenbank. Die lesbare sekundäre Datenbank kann nicht in die Verfügbarkeitsgruppe aufgenommen werden, wenn es aktive Transaktionen auf der primären Datenbank gibt. Zusätzlich können DLL Änderungen auf der primären Datenbank durch Abfragen auf der sekundären blockiert werden. Und imkrementelle Backups werden hier nicht unterstützt. Keine dieser Restriktionen existiert unter Oracle Data Guard.

Read the article
The Incremental Architect´s Napkin – #3 – Make Evolvability inevitable

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/06/04/the-incremental-architectacutes-napkin-ndash-3-ndash-make-evolvability-inevitable.aspxThe easier something to measure the more likely it will be produced. Deviations between what is and what should be can be readily detected. That´s what automated acceptance tests are for. That´s what sprint reviews in Scrum are for. It´s no small wonder our software looks like it looks. It has all the traits whose conformance with requirements can easily be measured. And it´s lacking traits which cannot easily be measured. Evolvability (or Changeability) is such a trait. If an operation is correct, if an operation if fast enough, that can be checked very easily. But whether Evolvability is high or low, that cannot be checked by taking a measure or two. Evolvability might correlate with certain traits, e.g. number of lines of code (LOC) per function or Cyclomatic Complexity or test coverage. But there is no threshold value signalling “evolvability too low”; also Evolvability is hardly tangible for the customer. Nevertheless Evolvability is of great importance - at least in the long run. You can get away without much of it for a short time. Eventually, though, it´s needed like any other requirement. Or even more. Because without Evolvability no other requirement can be implemented. Evolvability is the foundation on which all else is build. Such fundamental importance is in stark contrast with its immeasurability. To compensate this, Evolvability must be put at the very center of software development. It must become the hub around everything else revolves. Since we cannot measure Evolvability, though, we cannot start watching it more. Instead we need to establish practices to keep it high (enough) at all times. Chefs have known that for long. That´s why everybody in a restaurant kitchen is constantly seeing after cleanliness. Hygiene is important as is to have clean tools at standardized locations. Only then the health of the patrons can be guaranteed and production efficiency is constantly high. Still a kitchen´s level of cleanliness is easier to measure than software Evolvability. That´s why important practices like reviews, pair programming, or TDD are not enough, I guess. What we need to keep Evolvability in focus and high is… to continually evolve. Change must not be something to avoid but too embrace. To me that means the whole change cycle from requirement analysis to delivery needs to be gone through more often. Scrum´s sprints of 4, 2 even 1 week are too long. Kanban´s flow of user stories across is too unreliable; it takes as long as it takes. Instead we should fix the cycle time at 2 days max. I call that Spinning. No increment must take longer than from this morning until tomorrow evening to finish. Then it should be acceptance checked by the customer (or his/her representative, e.g. a Product Owner). For me there are several resasons for such a fixed and short cycle time for each increment: Clear expectations Absolute estimates (“This will take X days to complete.”) are near impossible in software development as explained previously. Too much unplanned research and engineering work lurk in every feature. And then pervasive interruptions of work by peers and management. However, the smaller the scope the better our absolute estimates become. That´s because we understand better what really are the requirements and what the solution should look like. But maybe more importantly the shorter the timespan the more we can control how we use our time. So much can happen over the course of a week and longer timespans. But if push comes to shove I can block out all distractions and interruptions for a day or possibly two. That´s why I believe we can give rough absolute estimates on 3 levels: Noon Tonight Tomorrow Think of a meeting with a Product Owner at 8:30 in the morning. If she asks you, how long it will take you to implement a user story or bug fix, you can say, “It´ll be fixed by noon.”, or you can say, “I can manage to implement it until tonight before I leave.”, or you can say, “You´ll get it by tomorrow night at latest.” Yes, I believe all else would be naive. If you´re not confident to get something done by tomorrow night (some 34h from now) you just cannot reliably commit to any timeframe. That means you should not promise anything, you should not even start working on the issue. So when estimating use these four categories: Noon, Tonight, Tomorrow, NoClue - with NoClue meaning the requirement needs to be broken down further so each aspect can be assigned to one of the first three categories. If you like absolute estimates, here you go. But don´t do deep estimates. Don´t estimate dozens of issues; don´t think ahead (“Issue A is a Tonight, then B will be a Tomorrow, after that it´s C as a Noon, finally D is a Tonight - that´s what I´ll do this week.”). Just estimate so Work-in-Progress (WIP) is 1 for everybody - plus a small number of buffer issues. To be blunt: Yes, this makes promises impossible as to what a team will deliver in terms of scope at a certain date in the future. But it will give a Product Owner a clear picture of what to pull for acceptance feedback tonight and tomorrow. Trust through reliability Our trade is lacking trust. Customers don´t trust software companies/departments much. Managers don´t trust developers much. I find that perfectly understandable in the light of what we´re trying to accomplish: delivering software in the face of uncertainty by means of material good production. Customers as well as managers still expect software development to be close to production of houses or cars. But that´s a fundamental misunderstanding. Software development ist development. It´s basically research. As software developers we´re constantly executing experiments to find out what really provides value to users. We don´t know what they need, we just have mediated hypothesises. That´s why we cannot reliably deliver on preposterous demands. So trust is out of the window in no time. If we switch to delivering in short cycles, though, we can regain trust. Because estimates - explicit or implicit - up to 32 hours at most can be satisfied. I´d say: reliability over scope. It´s more important to reliably deliver what was promised then to cover a lot of requirement area. So when in doubt promise less - but deliver without delay. Deliver on scope (Functionality and Quality); but also deliver on Evolvability, i.e. on inner quality according to accepted principles. Always. Trust will be the reward. Less complexity of communication will follow. More goodwill buffer will follow. So don´t wait for some Kanban board to show you, that flow can be improved by scheduling smaller stories. You don´t need to learn that the hard way. Just start with small batch sizes of three different sizes. Fast feedback What has been finished can be checked for acceptance. Why wait for a sprint of several weeks to end? Why let the mental model of the issue and its solution dissipate? If you get final feedback after one or two weeks, you hardly remember what you did and why you did it. Resoning becomes hard. But more importantly youo probably are not in the mood anymore to go back to something you deemed done a long time ago. It´s boring, it´s frustrating to open up that mental box again. Learning is harder the longer it takes from event to feedback. Effort can be wasted between event (finishing an issue) and feedback, because other work might go in the wrong direction based on false premises. Checking finished issues for acceptance is the most important task of a Product Owner. It´s even more important than planning new issues. Because as long as work started is not released (accepted) it´s potential waste. So before starting new work better make sure work already done has value. By putting the emphasis on acceptance rather than planning true pull is established. As long as planning and starting work is more important, it´s a push process. Accept a Noon issue on the same day before leaving. Accept a Tonight issue before leaving today or first thing tomorrow morning. Accept a Tomorrow issue tomorrow night before leaving or early the day after tomorrow. After acceptance the developer(s) can start working on the next issue. Flexibility As if reliability/trust and fast feedback for less waste weren´t enough economic incentive, there is flexibility. After each issue the Product Owner can change course. If on Monday morning feature slices A, B, C, D, E were important and A, B, C were scheduled for acceptance by Monday evening and Tuesday evening, the Product Owner can change her mind at any time. Maybe after A got accepted she asks for continuation with D. But maybe, just maybe, she has gotten a completely different idea by then. Maybe she wants work to continue on F. And after B it´s neither D nor E, but G. And after G it´s D. With Spinning every 32 hours at latest priorities can be changed. And nothing is lost. Because what got accepted is of value. It provides an incremental value to the customer/user. Or it provides internal value to the Product Owner as increased knowledge/decreased uncertainty. I find such reactivity over commitment economically very benefical. Why commit a team to some workload for several weeks? It´s unnecessary at beast, and inflexible and wasteful at worst. If we cannot promise delivery of a certain scope on a certain date - which is what customers/management usually want -, we can at least provide them with unpredecented flexibility in the face of high uncertainty. Where the path is not clear, cannot be clear, make small steps so you´re able to change your course at any time. Premature completion Customers/management are used to premeditating budgets. They want to know exactly how much to pay for a certain amount of requirements. That´s understandable. But it does not match with the nature of software development. We should know that by now. Maybe there´s somewhere in the world some team who can consistently deliver on scope, quality, and time, and budget. Great! Congratulations! I, however, haven´t seen such a team yet. Which does not mean it´s impossible, but I think it´s nothing I can recommend to strive for. Rather I´d say: Don´t try this at home. It might hurt you one way or the other. However, what we can do, is allow customers/management stop work on features at any moment. With spinning every 32 hours a feature can be declared as finished - even though it might not be completed according to initial definition. I think, progress over completion is an important offer software development can make. Why think in terms of completion beyond a promise for the next 32 hours? Isn´t it more important to constantly move forward? Step by step. We´re not running sprints, we´re not running marathons, not even ultra-marathons. We´re in the sport of running forever. That makes it futile to stare at the finishing line. The very concept of a burn-down chart is misleading (in most cases). Whoever can only think in terms of completed requirements shuts out the chance for saving money. The requirements for a features mostly are uncertain. So how does a Product Owner know in the first place, how much is needed. Maybe more than specified is needed - which gets uncovered step by step with each finished increment. Maybe less than specified is needed. After each 4–32 hour increment the Product Owner can do an experient (or invite users to an experiment) if a particular trait of the software system is already good enough. And if so, she can switch the attention to a different aspect. In the end, requirements A, B, C then could be finished just 70%, 80%, and 50%. What the heck? It´s good enough - for now. 33% money saved. Wouldn´t that be splendid? Isn´t that a stunning argument for any budget-sensitive customer? You can save money and still get what you need? Pull on practices So far, in addition to more trust, more flexibility, less money spent, Spinning led to “doing less” which also means less code which of course means higher Evolvability per se. Last but not least, though, I think Spinning´s short acceptance cycles have one more effect. They excert pull-power on all sorts of practices known for increasing Evolvability. If, for example, you believe high automated test coverage helps Evolvability by lowering the fear of inadverted damage to a code base, why isn´t 90% of the developer community practicing automated tests consistently? I think, the answer is simple: Because they can do without. Somehow they manage to do enough manual checks before their rare releases/acceptance checks to ensure good enough correctness - at least in the short term. The same goes for other practices like component orientation, continuous build/integration, code reviews etc. None of that is compelling, urgent, imperative. Something else always seems more important. So Evolvability principles and practices fall through the cracks most of the time - until a project hits a wall. Then everybody becomes desperate; but by then (re)gaining Evolvability has become as very, very difficult and tedious undertaking. Sometimes up to the point where the existence of a project/company is in danger. With Spinning that´s different. If you´re practicing Spinning you cannot avoid all those practices. With Spinning you very quickly realize you cannot deliver reliably even on your 32 hour promises. Spinning thus is pulling on developers to adopt principles and practices for Evolvability. They will start actively looking for ways to keep their delivery rate high. And if not, management will soon tell them to do that. Because first the Product Owner then management will notice an increasing difficulty to deliver value within 32 hours. There, finally there emerges a way to measure Evolvability: The more frequent developers tell the Product Owner there is no way to deliver anything worth of feedback until tomorrow night, the poorer Evolvability is. Don´t count the “WTF!”, count the “No way!” utterances. In closing For sustainable software development we need to put Evolvability first. Functionality and Quality must not rule software development but be implemented within a framework ensuring (enough) Evolvability. Since Evolvability cannot be measured easily, I think we need to put software development “under pressure”. Software needs to be changed more often, in smaller increments. Each increment being relevant to the customer/user in some way. That does not mean each increment is worthy of shipment. It´s sufficient to gain further insight from it. Increments primarily serve the reduction of uncertainty, not sales. Sales even needs to be decoupled from this incremental progress. No more promises to sales. No more delivery au point. Rather sales should look at a stream of accepted increments (or incremental releases) and scoup from that whatever they find valuable. Sales and marketing need to realize they should work on what´s there, not what might be possible in the future. But I digress… In my view a Spinning cycle - which is not easy to reach, which requires practice - is the core practice to compensate the immeasurability of Evolvability. From start to finish of each issue in 32 hours max - that´s the challenge we need to accept if we´re serious increasing Evolvability. Fortunately higher Evolvability is not the only outcome of Spinning. Customer/management will like the increased flexibility and “getting more bang for the buck”.

Read the article
C#/.NET Little Wonders: The ConcurrentDictionary

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In this series of posts, we will discuss how the concurrent collections have been developed to help alleviate these multi-threading concerns. Last week’s post began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. Today's post discusses the ConcurrentDictionary<T> (originally I had intended to discuss ConcurrentBag this week as well, but ConcurrentDictionary had enough information to create a very full post on its own!). Finally next week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. Recap As you'll recall from the previous post, the original collections were object-based containers that accomplished synchronization through a Synchronized member. While these were convenient because you didn't have to worry about writing your own synchronization logic, they were a bit too finely grained and if you needed to perform multiple operations under one lock, the automatic synchronization didn't buy much. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. This cuts both ways in that you have a lot more control as a developer over when and how fine-grained you want to synchronize, but on the other hand if you just want simple synchronization it creates more work. With .NET 4.0, we get the best of both worlds in generic collections. A new breed of collections was born called the concurrent collections in the System.Collections.Concurrent namespace. These amazing collections are fine-tuned to have best overall performance for situations requiring concurrent access. They are not meant to replace the generic collections, but to simply be an alternative to creating your own locking mechanisms. Among those concurrent collections were the ConcurrentStack<T> and ConcurrentQueue<T> which provide classic LIFO and FIFO collections with a concurrent twist. As we saw, some of the traditional methods that required calls to be made in a certain order (like checking for not IsEmpty before calling Pop()) were replaced in favor of an umbrella operation that combined both under one lock (like TryPop()). Now, let's take a look at the next in our series of concurrent collections!For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentDictionary – the fully thread-safe dictionary The ConcurrentDictionary<TKey,TValue> is the thread-safe counterpart to the generic Dictionary<TKey, TValue> collection. Obviously, both are designed for quick – O(1) – lookups of data based on a key. If you think of algorithms where you need lightning fast lookups of data and don’t care whether the data is maintained in any particular ordering or not, the unsorted dictionaries are generally the best way to go. Note: as a side note, there are sorted implementations of IDictionary, namely SortedDictionary and SortedList which are stored as an ordered tree and a ordered list respectively. While these are not as fast as the non-sorted dictionaries – they are O(log2 n) – they are a great combination of both speed and ordering -- and still greatly outperform a linear search. Now, once again keep in mind that if all you need to do is load a collection once and then allow multi-threaded reading you do not need any locking. Examples of this tend to be situations where you load a lookup or translation table once at program start, then keep it in memory for read-only reference. In such cases locking is completely non-productive. However, most of the time when we need a concurrent dictionary we are interleaving both reads and updates. This is where the ConcurrentDictionary really shines! It achieves its thread-safety with no common lock to improve efficiency. It actually uses a series of locks to provide concurrent updates, and has lockless reads! This means that the ConcurrentDictionary gets even more efficient the higher the ratio of reads-to-writes you have. ConcurrentDictionary and Dictionary differences For the most part, the ConcurrentDictionary<TKey,TValue> behaves like it’s Dictionary<TKey,TValue> counterpart with a few differences. Some notable examples of which are: Add() does not exist in the concurrent dictionary. This means you must use TryAdd(), AddOrUpdate(), or GetOrAdd(). It also means that you can’t use a collection initializer with the concurrent dictionary. TryAdd() replaced Add() to attempt atomic, safe adds. Because Add() only succeeds if the item doesn’t already exist, we need an atomic operation to check if the item exists, and if not add it while still under an atomic lock. TryUpdate() was added to attempt atomic, safe updates. If we want to update an item, we must make sure it exists first and that the original value is what we expected it to be. If all these are true, we can update the item under one atomic step. TryRemove() was added to attempt atomic, safe removes. To safely attempt to remove a value we need to see if the key exists first, this checks for existence and removes under an atomic lock. AddOrUpdate() was added to attempt an thread-safe “upsert”. There are many times where you want to insert into a dictionary if the key doesn’t exist, or update the value if it does. This allows you to make a thread-safe add-or-update. GetOrAdd() was added to attempt an thread-safe query/insert. Sometimes, you want to query for whether an item exists in the cache, and if it doesn’t insert a starting value for it. This allows you to get the value if it exists and insert if not. Count, Keys, Values properties take a snapshot of the dictionary. Accessing these properties may interfere with add and update performance and should be used with caution. ToArray() returns a static snapshot of the dictionary. That is, the dictionary is locked, and then copied to an array as a O(n) operation. GetEnumerator() is thread-safe and efficient, but allows dirty reads. Because reads require no locking, you can safely iterate over the contents of the dictionary. The only downside is that, depending on timing, you may get dirty reads. Dirty reads during iteration The last point on GetEnumerator() bears some explanation. Picture a scenario in which you call GetEnumerator() (or iterate using a foreach, etc.) and then, during that iteration the dictionary gets updated. This may not sound like a big deal, but it can lead to inconsistent results if used incorrectly. The problem is that items you already iterated over that are updated a split second after don’t show the update, but items that you iterate over that were updated a split second before do show the update. Thus you may get a combination of items that are “stale” because you iterated before the update, and “fresh” because they were updated after GetEnumerator() but before the iteration reached them. Let’s illustrate with an example, let’s say you load up a concurrent dictionary like this: 1: // load up a dictionary. 2: var dictionary = new ConcurrentDictionary<string, int>(); 3: 4: dictionary["A"] = 1; 5: dictionary["B"] = 2; 6: dictionary["C"] = 3; 7: dictionary["D"] = 4; 8: dictionary["E"] = 5; 9: dictionary["F"] = 6; Then you have one task (using the wonderful TPL!) to iterate using dirty reads: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); And one task to attempt updates in a separate thread (probably): 1: // attempt updates in a separate thread 2: var updateTask = new Task(() => 3: { 4: // iterates, and updates the value by one 5: foreach (var pair in dictionary) 6: { 7: dictionary[pair.Key] = pair.Value + 1; 8: } 9: }); Now that we’ve done this, we can fire up both tasks and wait for them to complete: 1: // start both tasks 2: updateTask.Start(); 3: iterationTask.Start(); 4: 5: // wait for both to complete. 6: Task.WaitAll(updateTask, iterationTask); Now, if I you didn’t know about the dirty reads, you may have expected to see the iteration before the updates (such as A:1, B:2, C:3, D:4, E:5, F:6). However, because the reads are dirty, we will quite possibly get a combination of some updated, some original. My own run netted this result: 1: F:6 2: E:6 3: D:5 4: C:4 5: B:3 6: A:2 Note that, of course, iteration is not in order because ConcurrentDictionary, like Dictionary, is unordered. Also note that both E and F show the value 6. This is because the output task reached F before the update, but the updates for the rest of the items occurred before their output (probably because console output is very slow, comparatively). If we want to always guarantee that we will get a consistent snapshot to iterate over (that is, at the point we ask for it we see precisely what is in the dictionary and no subsequent updates during iteration), we should iterate over a call to ToArray() instead: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary.ToArray()) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); The atomic Try…() methods As you can imagine TryAdd() and TryRemove() have few surprises. Both first check the existence of the item to determine if it can be added or removed based on whether or not the key currently exists in the dictionary: 1: // try add attempts an add and returns false if it already exists 2: if (dictionary.TryAdd("G", 7)) 3: Console.WriteLine("G did not exist, now inserted with 7"); 4: else 5: Console.WriteLine("G already existed, insert failed."); TryRemove() also has the virtue of returning the value portion of the removed entry matching the given key: 1: // attempt to remove the value, if it exists it is removed and the original is returned 2: int removedValue; 3: if (dictionary.TryRemove("C", out removedValue)) 4: Console.WriteLine("Removed C and its value was " + removedValue); 5: else 6: Console.WriteLine("C did not exist, remove failed."); Now TryUpdate() is an interesting creature. You might think from it’s name that TryUpdate() first checks for an item’s existence, and then updates if the item exists, otherwise it returns false. Well, note quite... It turns out when you call TryUpdate() on a concurrent dictionary, you pass it not only the new value you want it to have, but also the value you expected it to have before the update. If the item exists in the dictionary, and it has the value you expected, it will update it to the new value atomically and return true. If the item is not in the dictionary or does not have the value you expected, it is not modified and false is returned. 1: // attempt to update the value, if it exists and if it has the expected original value 2: if (dictionary.TryUpdate("G", 42, 7)) 3: Console.WriteLine("G existed and was 7, now it's 42."); 4: else 5: Console.WriteLine("G either didn't exist, or wasn't 7."); The composite Add methods The ConcurrentDictionary also has composite add methods that can be used to perform updates and gets, with an add if the item is not existing at the time of the update or get. The first of these, AddOrUpdate(), allows you to add a new item to the dictionary if it doesn’t exist, or update the existing item if it does. For example, let’s say you are creating a dictionary of counts of stock ticker symbols you’ve subscribed to from a market data feed: 1: public sealed class SubscriptionManager 2: { 3: private readonly ConcurrentDictionary<string, int> _subscriptions = new ConcurrentDictionary<string, int>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public void AddSubscription(string tickerKey) 7: { 8: // add a new subscription with count of 1, or update existing count by 1 if exists 9: var resultCount = _subscriptions.AddOrUpdate(tickerKey, 1, (symbol, count) => count + 1); 10: 11: // now check the result to see if we just incremented the count, or inserted first count 12: if (resultCount == 1) 13: { 14: // subscribe to symbol... 15: } 16: } 17: } Notice the update value factory Func delegate. If the key does not exist in the dictionary, the add value is used (in this case 1 representing the first subscription for this symbol), but if the key already exists, it passes the key and current value to the update delegate which computes the new value to be stored in the dictionary. The return result of this operation is the value used (in our case: 1 if added, existing value + 1 if updated). Likewise, the GetOrAdd() allows you to attempt to retrieve a value from the dictionary, and if the value does not currently exist in the dictionary it will insert a value. This can be handy in cases where perhaps you wish to cache data, and thus you would query the cache to see if the item exists, and if it doesn’t you would put the item into the cache for the first time: 1: public sealed class PriceCache 2: { 3: private readonly ConcurrentDictionary<string, double> _cache = new ConcurrentDictionary<string, double>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public double QueryPrice(string tickerKey) 7: { 8: // check for the price in the cache, if it doesn't exist it will call the delegate to create value. 9: return _cache.GetOrAdd(tickerKey, symbol => GetCurrentPrice(symbol)); 10: } 11: 12: private double GetCurrentPrice(string tickerKey) 13: { 14: // do code to calculate actual true price. 15: } 16: } There are other variations of these two methods which vary whether a value is provided or a factory delegate, but otherwise they work much the same. Oddities with the composite Add methods The AddOrUpdate() and GetOrAdd() methods are totally thread-safe, on this you may rely, but they are not atomic. It is important to note that the methods that use delegates execute those delegates outside of the lock. This was done intentionally so that a user delegate (of which the ConcurrentDictionary has no control of course) does not take too long and lock out other threads. This is not necessarily an issue, per se, but it is something you must consider in your design. The main thing to consider is that your delegate may get called to generate an item, but that item may not be the one returned! Consider this scenario: A calls GetOrAdd and sees that the key does not currently exist, so it calls the delegate. Now thread B also calls GetOrAdd and also sees that the key does not currently exist, and for whatever reason in this race condition it’s delegate completes first and it adds its new value to the dictionary. Now A is done and goes to get the lock, and now sees that the item now exists. In this case even though it called the delegate to create the item, it will pitch it because an item arrived between the time it attempted to create one and it attempted to add it. Let’s illustrate, assume this totally contrived example program which has a dictionary of char to int. And in this dictionary we want to store a char and it’s ordinal (that is, A = 1, B = 2, etc). So for our value generator, we will simply increment the previous value in a thread-safe way (perhaps using Interlocked): 1: public static class Program 2: { 3: private static int _nextNumber = 0; 4: 5: // the holder of the char to ordinal 6: private static ConcurrentDictionary<char, int> _dictionary 7: = new ConcurrentDictionary<char, int>(); 8: 9: // get the next id value 10: public static int NextId 11: { 12: get { return Interlocked.Increment(ref _nextNumber); } 13: } Then, we add a method that will perform our insert: 1: public static void Inserter() 2: { 3: for (int i = 0; i < 26; i++) 4: { 5: _dictionary.GetOrAdd((char)('A' + i), key => NextId); 6: } 7: } Finally, we run our test by starting two tasks to do this work and get the results… 1: public static void Main() 2: { 3: // 3 tasks attempting to get/insert 4: var tasks = new List<Task> 5: { 6: new Task(Inserter), 7: new Task(Inserter) 8: }; 9: 10: tasks.ForEach(t => t.Start()); 11: Task.WaitAll(tasks.ToArray()); 12: 13: foreach (var pair in _dictionary.OrderBy(p => p.Key)) 14: { 15: Console.WriteLine(pair.Key + ":" + pair.Value); 16: } 17: } If you run this with only one task, you get the expected A:1, B:2, ..., Z:26. But running this in parallel you will get something a bit more complex. My run netted these results: 1: A:1 2: B:3 3: C:4 4: D:5 5: E:6 6: F:7 7: G:8 8: H:9 9: I:10 10: J:11 11: K:12 12: L:13 13: M:14 14: N:15 15: O:16 16: P:17 17: Q:18 18: R:19 19: S:20 20: T:21 21: U:22 22: V:23 23: W:24 24: X:25 25: Y:26 26: Z:27 Notice that B is 3? This is most likely because both threads attempted to call GetOrAdd() at roughly the same time and both saw that B did not exist, thus they both called the generator and one thread got back 2 and the other got back 3. However, only one of those threads can get the lock at a time for the actual insert, and thus the one that generated the 3 won and the 3 was inserted and the 2 got discarded. This is why on these methods your factory delegates should be careful not to have any logic that would be unsafe if the value they generate will be pitched in favor of another item generated at roughly the same time. As such, it is probably a good idea to keep those generators as stateless as possible. Summary The ConcurrentDictionary is a very efficient and thread-safe version of the Dictionary generic collection. It has all the benefits of type-safety that it’s generic collection counterpart does, and in addition is extremely efficient especially when there are more reads than writes concurrently. Tweet Technorati Tags: C#, .NET, Concurrent Collections, Collections, Little Wonders, Black Rabbit Coder,James Michael Hare

Read the article
Authoritative sources about Database vs. Flatfile decision

- by FastAl

<tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

Read the article
How do I prove I should put a table of values in source code instead of a database table?

- by FastAl

<tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

Read the article
Matlab code works with one version but not the other

- by user1325655

I have a code that works in Matlab version R2010a but shows errors in matlab R2008a. I am trying to implement a self organizing fuzzy neural network with extended kalman filter. I have the code running but it only works in matlab version R2010a. It doesn't work with other versions. Any help? Code attach function [ c, sigma , W_output ] = SOFNN( X, d, Kd ) %SOFNN Self-Organizing Fuzzy Neural Networks %Input Parameters % X(r,n) - rth traning data from nth observation % d(n) - the desired output of the network (must be a row vector) % Kd(r) - predefined distance threshold for the rth input %Output Parameters % c(IndexInputVariable,IndexNeuron) % sigma(IndexInputVariable,IndexNeuron) % W_output is a vector %Setting up Parameters for SOFNN SigmaZero=4; delta=0.12; threshold=0.1354; k_sigma=1.12; %For more accurate results uncomment the following %format long; %Implementation of a SOFNN model [size_R,size_N]=size(X); %size_R - the number of input variables c=[]; sigma=[]; W_output=[]; u=0; % the number of neurons in the structure Q=[]; O=[]; Psi=[]; for n=1:size_N x=X(:,n); if u==0 % No neuron in the structure? c=x; sigma=SigmaZero*ones(size_R,1); u=1; Psi=GetMePsi(X,c,sigma); [Q,O] = UpdateStructure(X,Psi,d); pT_n=GetMeGreatPsi(x,Psi(n,:))'; else [Q,O,pT_n] = UpdateStructureRecursively(X,Psi,Q,O,d,n); end; KeepSpinning=true; while KeepSpinning %Calculate the error and if-part criteria ae=abs(d(n)-pT_n*O); %approximation error [phi,~]=GetMePhi(x,c,sigma); [maxphi,maxindex]=max(phi); % maxindex refers to the neuron's index if ae>delta if maxphi<threshold %enlarge width [minsigma,minindex]=min(sigma(:,maxindex)); sigma(minindex,maxindex)=k_sigma*minsigma; Psi=GetMePsi(X,c,sigma); [Q,O] = UpdateStructure(X,Psi,d); pT_n=GetMeGreatPsi(x,Psi(n,:))'; else %Add a new neuron and update structure ctemp=[]; sigmatemp=[]; dist=0; for r=1:size_R dist=abs(x(r)-c(r,1)); distIndex=1; for j=2:u if abs(x(r)-c(r,j))<dist distIndex=j; dist=abs(x(r)-c(r,j)); end; end; if dist<=Kd(r) ctemp=[ctemp; c(r,distIndex)]; sigmatemp=[sigmatemp ; sigma(r,distIndex)]; else ctemp=[ctemp; x(r)]; sigmatemp=[sigmatemp ; dist]; end; end; c=[c ctemp]; sigma=[sigma sigmatemp]; Psi=GetMePsi(X,c,sigma); [Q,O] = UpdateStructure(X,Psi,d); KeepSpinning=false; u=u+1; end; else if maxphi<threshold %enlarge width [minsigma,minindex]=min(sigma(:,maxindex)); sigma(minindex,maxindex)=k_sigma*minsigma; Psi=GetMePsi(X,c,sigma); [Q,O] = UpdateStructure(X,Psi,d); pT_n=GetMeGreatPsi(x,Psi(n,:))'; else %Do nothing and exit the while KeepSpinning=false; end; end; end; end; W_output=O; end function [Q_next, O_next,pT_n] = UpdateStructureRecursively(X,Psi,Q,O,d,n) %O=O(t-1) O_next=O(t) p_n=GetMeGreatPsi(X(:,n),Psi(n,:)); pT_n=p_n'; ee=abs(d(n)-pT_n*O); %|e(t)| temp=1+pT_n*Q*p_n; ae=abs(ee/temp); if ee>=ae L=Q*p_n*(temp)^(-1); Q_next=(eye(length(Q))-L*pT_n)*Q; O_next=O + L*ee; else Q_next=eye(length(Q))*Q; O_next=O; end; end function [ Q , O ] = UpdateStructure(X,Psi,d) GreatPsiBig = GetMeGreatPsi(X,Psi); %M=u*(r+1) %n - the number of observations [M,~]=size(GreatPsiBig); %Others Ways of getting Q=[P^T(t)*P(t)]^-1 %************************************************************************** %opts.SYM = true; %Q = linsolve(GreatPsiBig*GreatPsiBig',eye(M),opts); % %Q = inv(GreatPsiBig*GreatPsiBig'); %Q = pinv(GreatPsiBig*GreatPsiBig'); %************************************************************************** Y=GreatPsiBig\eye(M); Q=GreatPsiBig'\Y; O=Q*GreatPsiBig*d'; end %This function works too with x % (X=X and Psi is a Matrix) - Gets you the whole GreatPsi % (X=x and Psi is the row related to x) - Gets you just the column related with the observation function [GreatPsi] = GetMeGreatPsi(X,Psi) %Psi - In a row you go through the neurons and in a column you go through number of %observations **** Psi(#obs,IndexNeuron) **** GreatPsi=[]; [N,U]=size(Psi); for n=1:N x=X(:,n); GreatPsiCol=[]; for u=1:U GreatPsiCol=[ GreatPsiCol ; Psi(n,u)*[1; x] ]; end; GreatPsi=[GreatPsi GreatPsiCol]; end; end function [phi, SumPhi]=GetMePhi(x,c,sigma) [r,u]=size(c); %u - the number of neurons in the structure %r - the number of input variables phi=[]; SumPhi=0; for j=1:u % moving through the neurons S=0; for i=1:r % moving through the input variables S = S + ((x(i) - c(i,j))^2) / (2*sigma(i,j)^2); end; phi = [phi exp(-S)]; SumPhi = SumPhi + phi(j); %phi(u)=exp(-S) end; end %This function works too with x, it will give you the row related to x function [Psi] = GetMePsi(X,c,sigma) [~,u]=size(c); [~,size_N]=size(X); %u - the number of neurons in the structure %size_N - the number of observations Psi=[]; for n=1:size_N [phi, SumPhi]=GetMePhi(X(:,n),c,sigma); PsiTemp=[]; for j=1:u %PsiTemp is a row vector ex: [1 2 3] PsiTemp(j)=phi(j)/SumPhi; end; Psi=[Psi; PsiTemp]; %Psi - In a row you go through the neurons and in a column you go through number of %observations **** Psi(#obs,IndexNeuron) **** end; end

Read the article
WCF Service returning 400 error: The body of the message cannot be read because it is empty

- by Josh

I have a WCF service that is causing a bit of a headache. I have tracing enabled, I have an object with a data contract being built and passed in, but I am seeing this error in the log: <TraceData> <DataItem> <TraceRecord xmlns="http://schemas.microsoft.com/2004/10/E2ETraceEvent/TraceRecord" Severity="Error"> <TraceIdentifier>http://msdn.microsoft.com/en-US/library/System.ServiceModel.Diagnostics.ThrowingException.aspx</TraceIdentifier> <Description>Throwing an exception.</Description> <AppDomain>efb0d0d7-1-129315381593520544</AppDomain> <Exception> <ExceptionType>System.ServiceModel.ProtocolException, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</ExceptionType> <Message>There is a problem with the XML that was received from the network. See inner exception for more details.</Message> <StackTrace> at System.ServiceModel.Channels.HttpRequestContext.CreateMessage() at System.ServiceModel.Channels.HttpChannelListener.HttpContextReceived(HttpRequestContext context, Action callback) at System.ServiceModel.Activation.HostedHttpTransportManager.HttpContextReceived(HostedHttpRequestAsyncResult result) at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.HandleRequest() at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.BeginRequest() at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.OnBeginRequest(Object state) at System.Runtime.IOThreadScheduler.ScheduledOverlapped.IOCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped) at System.Runtime.Fx.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped) at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP) </StackTrace> <ExceptionString> System.ServiceModel.ProtocolException: There is a problem with the XML that was received from the network. See inner exception for more details. ---&gt; System.Xml.XmlException: The body of the message cannot be read because it is empty. --- End of inner exception stack trace --- </ExceptionString> <InnerException> <ExceptionType>System.Xml.XmlException, System.Xml, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</ExceptionType> <Message>The body of the message cannot be read because it is empty.</Message> <StackTrace> at System.ServiceModel.Channels.HttpRequestContext.CreateMessage() at System.ServiceModel.Channels.HttpChannelListener.HttpContextReceived(HttpRequestContext context, Action callback) at System.ServiceModel.Activation.HostedHttpTransportManager.HttpContextReceived(HostedHttpRequestAsyncResult result) at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.HandleRequest() at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.BeginRequest() at System.ServiceModel.Activation.HostedHttpRequestAsyncResult.OnBeginRequest(Object state) at System.Runtime.IOThreadScheduler.ScheduledOverlapped.IOCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped) at System.Runtime.Fx.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped) at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP) </StackTrace> <ExceptionString>System.Xml.XmlException: The body of the message cannot be read because it is empty.</ExceptionString> </InnerException> </Exception> </TraceRecord> </DataItem> </TraceData> So, here is my service interface: [ServiceContract] public interface IRDCService { [OperationContract] Response<Customer> GetCustomer(CustomerRequest request); [OperationContract] Response<Customer> GetSiteCustomers(CustomerRequest request); } And here is my service instance public class RDCService : IRDCService { ICustomerService customerService; public RDCService() { //We have to locate the instance from structuremap manually because web services *REQUIRE* a default constructor customerService = ServiceLocator.Locate<ICustomerService>(); } public Response<Customer> GetCustomer(CustomerRequest request) { return customerService.GetCustomer(request); } public Response<Customer> GetSiteCustomers(CustomerRequest request) { return customerService.GetSiteCustomers(request); } } The configuration for the web service (server side) looks like this: <system.serviceModel> <diagnostics> <messageLogging logMalformedMessages="true" logMessagesAtServiceLevel="true" logMessagesAtTransportLevel="true" /> </diagnostics> <services> <service behaviorConfiguration="MySite.Web.Services.RDCServiceBehavior" name="MySite.Web.Services.RDCService"> <endpoint address="http://localhost:27433" binding="wsHttpBinding" contract="MySite.Common.Services.Web.IRDCService"> <identity> <dns value="localhost:27433" /> </identity> </endpoint> <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange" /> </service> </services> <behaviors> <serviceBehaviors> <behavior name="MySite.Web.Services.RDCServiceBehavior">  <serviceMetadata httpGetEnabled="true"/>  <serviceDebug includeExceptionDetailInFaults="true"/> <dataContractSerializer maxItemsInObjectGraph="6553600" /> </behavior> </serviceBehaviors> </behaviors> </system.serviceModel> Here is what my request object looks like [DataContract] public class CustomerRequest : RequestBase { [DataMember] public int Id { get; set; } [DataMember] public int SiteId { get; set; } } And the RequestBase: [DataContract] public abstract class RequestBase : IRequest { #region IRequest Members [DataMember] public int PageSize { get; set; } [DataMember] public int PageIndex { get; set; } #endregion } And my IRequest interface public interface IRequest { int PageSize { get; set; } int PageIndex { get; set; } } And I have a wrapper class around my service calls. Here is the class. public class MyService : IMyService { IRDCService service; public MyService() { //service = new MySite.RDCService.RDCServiceClient(); EndpointAddress address = new EndpointAddress(APISettings.Default.ServiceUrl); BasicHttpBinding binding = new BasicHttpBinding(BasicHttpSecurityMode.None); binding.TransferMode = TransferMode.Streamed; binding.MaxBufferSize = 65536; binding.MaxReceivedMessageSize = 4194304; ChannelFactory<IRDCService> factory = new ChannelFactory<IRDCService>(binding, address); service = factory.CreateChannel(); } public Response<Customer> GetCustomer(CustomerRequest request) { return service.GetCustomer(request); } public Response<Customer> GetSiteCustomers(CustomerRequest request) { return service.GetSiteCustomers(request); } } and finally, the response object. [DataContract] public class Response<T> { [DataMember] public IEnumerable<T> Results { get; set; } [DataMember] public int TotalResults { get; set; } [DataMember] public int PageIndex { get; set; } [DataMember] public int PageSize { get; set; } [DataMember] public RulesException Exception { get; set; } } So, when I build my CustomerRequest object and pass it in, for some reason it's hitting the server as an empty request. Any ideas why? I've tried upping the object graph and the message size. When I debug it stops in the wrapper class with the 400 error. I'm not sure if there is a serialization error, but considering the object contract is 4 integer properties I can't imagine it causing an issue.

Read the article
CSS list menu; extra padding on rollover of buttons

- by user1669878

I have been going crazy trying to figure out why there is extra padding showing up on my navigation buttons when I rollover them. It's only showing up to the left and right of them though. Here's a link to the screenshot of what it looks like: http://i179.photobucket.com/albums/w319/jdauel/Screenshot2012-09-13at25417PM.png I think it has something to do with my CSS but I have no idea anymore. Please help me??? I tried using Firebug to figure it out with no prevail. Here's the code: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <title>Farren's Photography</title> <style type="text/css"> html { height: 100%; width: 100%; } body { margin: 0px; } #container { font-family: Georgia, "Times New Roman", Times, serif; font-size: 1.2em; color: #000; background-color: #06F; text-align: left; padding: 0px; height: 650px; width: 960px; margin-right: auto; margin-left: auto; background-image: url(images/background_image.png); background-repeat: no-repeat; margin-top: 45px; } a:link { color: #FFF; } a:visited { color: #FFF; } a:hover { color: #FFF; } #container #logo { } #container #logo #fp-logo { background-image: url(images/logo.png); height: 137px; width: 408px; text-indent: -9999px; display: block; } #logo { height: 137px; width: 408px; position: relative; padding-top: 35px; padding-right: 0px; padding-bottom: 0px; padding-left: 35px; } #main { background-color: #FFF; min-height: 383px; width: 707px; position: relative; left: 217px; top: 16px; right: 36px; bottom: 113px; } #container #navbar { font-family: Georgia, "Times New Roman", Times, serif; font-size: 14px; color: #FFF; text-align: right; height: 45px; background-color: #CC0000; position: relative; top: 8px; bottom: 0px; left: 0px; right: 0px; } #container #navbar ul li a { text-decoration: none; } #container #navbar ul { list-style-type: none; padding-top: 16px; } #container #navbar ul li { display: inline; background-color: #280803; margin: 0px; height: 0px; width: 0px; position: relative; padding-top: 16px; padding-right: 15px; padding-bottom: 17px; padding-left: 15px; } #container #navbar ul li a:link { text-decoration: none; color: #FFF; } #container #navbar ul li a:visited { text-decoration: none; color: #FFF; } #container #navbar ul li a:hover { text-decoration: none; color: #FFF; background-color: #027e8e; padding-top: 16px; padding-right: 15px; padding-bottom: 17px; padding-left: 15px; margin: 0px; } #footer { font-family: Arial, Helvetica, sans-serif; font-size: x-small; height: 28px; position: relative; top: 8px; color: #FFF; font-style: italic; } </style> </head> <body> <div id="container"> <div id="logo"><a href="http://www.farrensphotography.com" title="Farren's Photography" target="_self" id="fp-logo">Farren's Photography</a></div> <div id="main"> <div id="content"> </div> </div> <div id="navbar"> <ul> <li><a href="index.html" target="_self">Home</a></li> <li><a href="portfolio.html" target="_self">Portfolio</a></li> <li><a href="mystyle.html" target="_self">My Style</a></li> <li><a href="specials.html" target="_self">Specials</a></li> <li><a href="pricing.html" target="_self">Pricing</a></li> <li><a href="contact.html" target="_self">Contact</a></li> </ul> </div>  <div id="footer"> <div id="copyright">All images copyright© Farrens Photography </div> <div id="network">Facebook button </div> </div> </div> </body> </html>

Read the article
Unable to debug simple Java application in Eclipse. Cannot connect to VM. AGENT_ERROR_TRANSPORT_INIT(197)

- by Heptaparaparshi

When i try to debug a simple application in Eclipse i receive a following error: Cannot connect to VM com.sun.jdi.connect.TransportTimeoutException And console provides me with a lonely string: FATAL ERROR in native method: JDWP No transports initialized, jvmtiError=AGENT_ERROR_TRANSPORT_INIT(197) I have JRE 1.6, JRE 1.7 and JDK 1.7 installed. Tried all of them. I've seen tons of same topics, but not a single answer helped me to solve my issue. Here they are: 1) Disable Firewall. Doesn't help. I have latest Avast ver. 9.0 at the moment. I'm a bit suspicious about that software, because before updating my Avast i was able to debug in Eclipse. I think it may cause this error, but i do not have direct clues :). I may ping my machine, firewall doesn't block Eclipse traffic, etc. 2) Add strings to hosts file. No reaction. ::1 localhost.localdomain localhost 127.0.0.1 localhost 3) Changing Network Settings in Java Control Panel to "Direct" connection. Doubtful advice. Also read that thing: http://wiki.eclipse.org/Debug/FAQ Can anyone help me to find out what is happening? Or guide me in the right direction?

Read the article
Mutual SSL Client Authentication

- by nordisk

Hi, I'm trying to achieve mutual SSL client authentication but without much success so far. Let me explain my scenario first: I have a client certificate issued by an intermediate CA whose certificate in turn was issued by a root CA (the intermediate and root CAs are within the company's network). This is the certificate I am including as part of my call to the server (using the HttpWebRequest object). The server has imported my client certificate and it is one of the certificates presented to me. An important thing to note is that the server does not trust the intermediate CA or the root for that matter. What we're trying to achieve is authentication against the certificate directly, i.e. mutual authentication using my client certificate. The error I'm getting is: "The request was aborted: Could not create SSL/TLS secure channel." From my trace logs I also get the following: System.Net Information: 0 : [3380] SecureChannel#34868631 - We have user-provided certificates. The server has specified 2 issuer(s). Looking for certificates that match any of the issuers. System.Net Information: 0 : [3380] SecureChannel#34868631 - Left with 0 client certificates to choose from. One of the certificates presented to us from the server is the same as our client certificate but the matching between them seems to fail. It looks like it's trying to verify the issuer. Now to make things even more interesting: If the server trusts and sends back the intermediate CA then everything works fine! (This is not an option for the production environment though I'm told) Using jmeter to test the request works fine too. I can only assume that Java's SSL handshake implementation is somewhat different. So it really comes down to this: Do you need to implement mutual SSL authentication differently from normal client SSL authentication? Any ideas or comments would be greatly appreciated.

Read the article
perl Client-SSL-Warning: Peer certificate not verified

- by Jeremey

I am having trouble with a perl screenscraper to an HTTPS site. In debugging, I ran the following: print $res->headers_as_string; and in the output, I have the following line: Client-SSL-Warning: Peer certificate not verified Is there a way I can auto-accept this certificate, or is that not the problem? #!/usr/bin/perl use LWP::UserAgent; use Crypt::SSLeay::CTX; use Crypt::SSLeay::Conn; use Crypt::SSLeay::X509; use LWP::Simple qw(get); my $ua = LWP::UserAgent->new; my $req = HTTP::Request->new(GET => 'https://vzw-cat.sun4.lightsurf.net/vzwcampaignadmin/'); my $res = $ua->request($req); print $res->headers_as_string; output: Cache-Control: no-cache Connection: close Date: Tue, 01 Jun 2010 19:28:08 GMT Pragma: No-cache Server: Apache Content-Type: text/html Expires: Wed, 31 Dec 1969 16:00:00 PST Client-Date: Tue, 01 Jun 2010 19:28:09 GMT Client-Peer: 64.152.68.114:443 Client-Response-Num: 1 Client-SSL-Cert-Issuer: /O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign Client-SSL-Cert-Subject: /C=US/ST=Massachusetts/L=Boston/O=verizon wireless/OU=TERMS OF USE AT WWW.VERISIGN.COM/RPA (C)00/CN=PSMSADMIN.VZW.COM Client-SSL-Cipher: DHE-RSA-AES256-SHA Client-SSL-Warning: Peer certificate not verified Client-Transfer-Encoding: chunked Link: <css/vtext_style.css>; rel="stylesheet"; type="text/css" Set-Cookie: JSESSIONID=DE6C99EA2F3DD1D4DF31456B94F16C90.vz3; Path=/vzwcampaignadmin; Secure Title: Verizon Wireless - Campaign Administrator

Read the article
MSDeploy doesn't deploy to remote server using MSBuild and Visual Studio 2010

- by user317762

I'm currently running Visual Studio Team System 2010 RC and I'm trying to get the Build Service setup to build my solution and deploy 3 web applications in it. I've created a custom build configuration called Integration and I've setup the "IIS Web site/application name to use on the destination server" on the Package/Publish tab of the Properties for each of the web applications. In my Build Definition I've set the following arguments: /p:DeployOnBuild=True /p:DeployTarget=MSDeployPublish /p:MSDeployPublishMethod=InProc /p:MsDeployServiceUrl=http://my-server-name:8172/msdeploy.axd /p:EnablePackageProcessLoggingAndAssert=True However, when I run the build I get the following error, for all three web applications: Updating setAcl (RightContent). C:\Program Files\MSBuild\Microsoft\VisualStudio\v10.0\Web\Microsoft.Web.Publishing.targets(3481,5): error : Web deployment task failed. (Attempted to perform an unauthorized operation.) I don't think this is my actual problem though. This error is occuring after the following entry in the log: Updating setAcl This is what's causing the error message, but it appears that MSDeploy is trying to deploy to the local IIS on the Build server, not the server I specified with the MsDeployServiceUrl parameter. After looking at the targets file at C:\Program Files\MSBuild\Microsoft\VisualStudio\v10.0\Web\Microsoft.Web.Publishing.targets, I added the EnablePackageProcessLoggingAndAssert, which adds extra logging. The log shows an emptry string for the value of MsDeployServiceUrl. I also noticed in the target that MsDeployServiceUrl has a lowercase s, which is somewhat confusing because the task name MSDeployPublish has an uppercase S. I tried using it using uppercase, then again using lowercase, but neither worked. A couple other things to note: My build service is running as NETWORK SERVICE. The server I'm trying to deploy to is on another domain. I also tried adding /p:username=mydomain\myusername /p:password=mypassword to the MSBuild paramter list, but that didn't help. Does anyone know if I'm supplying the correct parameters? Or provide me with the correct ones? Thanks

Read the article
What good technology podcasts are out there?

- by Michael Stum

Yes, Podcasts, those nice little Audiobooks I can listen to on the way to work. With the current amount of Podcasts, it's like searching a needle in a haystack, except that the haystack happens to be the Internet and is filled with too many of these "Hot new Gadgets" stuff :( Now, even though I am mainly a .NET developer nowadays, maybe anyone knows some good Podcasts from people regarding the whole software lifecycle? Unit Testing, Continous Integration, Documentation, Deployment... So - what are you guys and gals listening to? Please note that the categorizations are somewhat subjective and may not be 100% accurate as many podcasts cover several areas. Categorization is made against what is considered the "main" area. General Software Engineering / Productivity Stack Overflow TekPub (Requires Paid Subscription) SE Radio 43 Folders Perspectives Dr. Dobb's (now a video feed) The Pragmatic Podcast (Inactive) IT Matters Agile Toolkit Podcast The Stack Trace (Inactive) Parleys Techzing The Startup Success Podcast Berkeley CS class lectures FOSS Weekly .NET / Visual Studio / Microsoft Herding Code Hanselminutes .NET Rocks! Deep Fried Bytes Alt.Net Podcast Polymorphic Podcast Sparkling Client (The Silverlight Podcast) dnrTV! Spaghetti Code ASP.NET Podcast Channel 9 Radio TFS PowerScripting Podcast The Thirsty Developer Elegant Code ConnectedShow Crafty Coders Coding QA jQuery yayQuery The official jQuery podcast Java / Groovy The Java Posse Grails Podcast Java Technology Insider Ruby / Rails Railscasts Rails Envy The Ruby on Rails Podcast Rubiverse Web Design / JavaScript / Ajax WebDevRadio Boagworld The Rissington podcast Ajaxian YUI Theater Unix / Linux / Mac / iPhone Mac Developer Network Hacker Public Radio Linux Outlaws Mac OS Ken LugRadio Linux radio show (Inactive) The Linux Action Show! Linux Kernel Mailing List (LKML) Summary Podcast Stanford's iPhone programming class SysAdmin, Security or Infrastructure RunAs Radio Security Now! Crypto-Gram Security Podcast Hak5 VMWare VMTN Windows Weekly PaulDotCom Security The Register - Semi-Coherent Computing FeatherCast General Tech / Business Tekzilla This Week in Tech The Guardian Tech Weekly PCMag Radio Podcast Entrepreneurship Corner Manager Tools Other / Misc. / Podcast Networks IT Conversations Retrobits Podcast No Agenda Netcast Cranky Geeks The Command Line Freelance Radio IBM developerWorks The Register - Open Season Drunk and Retired Technometria Sod This Radio4Nerds Hacker Medley

Read the article
Exception on SslStream.AuthenticateAsClient (The message was badly formatted)

- by Noms

I have got wierd problem going on. I am trying to connect to Apple server via TCP/SSL. I am using a Client certificate provided by Apple for push notifications. I installed the certificate on my server (Win2k3) in both Local Trusted Root certificates and Local Personal Certificates folder. Now I have a class library that deals with that connection, when i call this class library from a console application running from the server it works absolutely fine, but when i call that class library from an asp.net page or asmx web service I get the following exception. A call to SSPI failed, see inner exception. The message received was unexpected or badly formatted. This is my code: X509Certificate cert = new X509Certificate(certificateLocation, certificatePassword); X509CertificateCollection certCollection = new X509CertificateCollection(new X509Certificate[1] { cert }); // OPEN the new SSL Stream SslStream ssl = new SslStream(client.GetStream(), false, new RemoteCertificateValidationCallback(ValidateServerCertificate), null); ssl.AuthenticateAsClient(ipAddress, certCollection, SslProtocols.Default, false); ssl.AuthenticateAsClient is where the error gets thrown. This is driving me nuts. If the console application can connect fine, there must be some problem with asp.net network layer security that is failing the authentication... not sure, perhaps need to add something or some sort of security policy in the web.config. Also just to point out that i can connect fine on my local development machine both with console and website. Anyone has got any ideas?

Read the article
High Profile ASP.NET websites

- by nandos

About twice a month I get asked to justify the reason "Why are we using ASP.NET and not PHP or Java, or buzz-word-of-the-month-here, etc". 100% of the time the questions come from people that do not understand anything about technology. People that would not know the difference between FTP and HTTP. The best approach I found (so far) to justify it to people without getting into technical details is to just say "XXX website uses it". Which I get back "Oh...I did not know that, so ASP.NET must be good". I know, I know, it hurts. But it works. So, without getting into the merit of why I'm using ASP.NET (which could trigger an endless argument for other platforms), I'm trying to compile a list of high profile websites that are implemented in ASP.NET. (No, they would have no idea what StackOverflow is). Can you name a high-profile website implemented in ASP.NET? EDIT: Current list (thanks for all the responses): (trying to avoid tech sites and prioritizing retail sites) Costco - http://www.costco.com/ Crate & Barrel - http://www.crateandbarrel.com/ Home Shopping Network - http://www.hsn.com/ Buy.com - http://www.buy.com/ Dell - http://www.dell.com Nasdaq - http://www.nasdaq.com/ Virgin - http://www.virgin.com/ 7-Eleven - http://www.7-eleven.com/ Carnival Cruise Lines - http://www.carnival.com/ L'Oreal - http://www.loreal.com/ The White House - http://www.whitehouse.gov/ Remax - http://www.remax.com/ Monster Jobs - http://www.monster.com/ USA Today - http://www.usatoday.com/ ComputerJobs.com - http://computerjobs.com/ Match.com - http://www.match.com National Health Services (UK) - http://www.nhs.uk/ CarrerBuilder.com - http://www.careerbuilder.com/

Read the article
How do I deploy building blocks (quick parts) for Microsoft Outlook 2007?

- by now

I want to deploy some building blocks for Microsoft Outlook 2007. Microsoft has put up a poor solution at http://office.microsoft.com/en-us/outlook/HA102086531033.aspx#4 that asks you to save a template. That solution would require you to distribute that template to all the clients. An optimal solution would allow you to put the template containing the building blocks somewhere on the network and simply use the ”Workgroup building blocks path” group policy setting for shared paths in Microsoft Office 2007. Sadly, Outlook doesn’t respect that policy. Also, the described solution mentioned in the article above doesn’t work. Step 4 requests you to save the template as a Word Template after first asking you to save it as an Outlook Template. It seems that they copy&pasted the steps from the Word article and forgot to check whether it worked (and adjust the steps accordingly). Anyway, does anyone have any suggestions for how to distribute the building blocks without distributing NormalEmail.dotm (which will overwrite the clients’ own building blocks each time it is updated). Thanks!

Read the article
System.Net.WebClient doesn't work with Windows Authentication

- by Peter Hahndorf

I am trying to use System.Net.WebClient in a WinForms application to upload a file to an IIS6 server which has Windows Authentication as it only 'Authentication' method. WebClient myWebClient = new WebClient(); myWebClient.Credentials = new System.Net.NetworkCredential(@"boxname\peter", "mypassword"); byte[] responseArray = myWebClient.UploadFile("http://localhost/upload.aspx", fileName); I get a 'The remote server returned an error: (401) Unauthorized', actually it is a 401.2 Both client and IIS are on the same Windows Server 2003 Dev machine. When I try to open the page in Firefox and enter the same correct credentials as in the code, the page comes up. However when using IE8, I get the same 401.2 error. Tried Chrome and Opera and they both work. I have 'Enable Integrated Windows Authentication' enabled in the IE Internet options. The Security Event Log has a Failure Audit: Logon Failure: Reason: An error occurred during logon User Name: peter Domain: boxname Logon Type: 3 Logon Process: ÈùÄ Authentication Package: NTLM Workstation Name: boxname Status code: 0xC000006D Substatus code: 0x0 Caller User Name: - Caller Domain: - Caller Logon ID: - Caller Process ID: - Transited Services: - Source Network Address: 127.0.0.1 Source Port: 1476 I used Process Monitor and Fiddler to investigate but to no avail. Why would this work for 3rd party browsers but not with IE or System.Net.WebClient?

Read the article
PDF rendering crashes app Core Graphics

- by Felixyz

EDIT: The memory leaks turned out to be unrelated to the crashes. Leaks are fixed but crashes remain, still mysterious. My (iPhone) app does lots of PDF loading and rendering, some of it threaded. Sometime, it seems always after I flush a page cash after getting a memory warning, the app crashes with a bad access when trying to draw a pdf page stored in an NSData object. Here is one example trace: #0 0x3016d564 in CGPDFResourcesGetResource () #1 0x3016d58a in CGPDFResourcesGetResource () #2 0x3016d94e in CGPDFResourcesGetExtGState () #3 0x3015fac4 in CGPDFContentStreamGetExtGState () #4 0x301629a8 in op_gs () #5 0x3016df12 in handle_xname () #6 0x3016dd9e in read_objects () #7 0x3016de6c in CGPDFScannerScan () #8 0x30161e34 in CGPDFDrawingContextDraw () #9 0x3016a9dc in CGContextDrawPDFPage () But sometimes I get this instead: Program received signal: “EXC_BAD_ACCESS”. (gdb) bt #0 0x335625fa in objc_msgSend () #1 0x32c04eba in CFDictionaryGetValue () #2 0x3016d500 in get_value () #3 0x3016d5d6 in CGPDFResourcesGetFont () #4 0x3015fbb4 in CGPDFContentStreamGetFont () #5 0x30163480 in op_Tf () #6 0x3016df12 in handle_xname () #7 0x3016dd9e in read_objects () #8 0x3016de6c in CGPDFScannerScan () #9 0x30161e34 in CGPDFDrawingContextDraw () #10 0x3016a9dc in CGContextDrawPDFPage () Is this an indication that I've mistakenly deallocated an object? It's hard for me to decode what's happening here. This is how I create and retain the various objects involved: // Some data was just loaded from the network and is pointed to by "data" self.pdfData = data; _dataProviderRef = CGDataProviderCreateWithData( NULL, [_pdfData bytes], [_pdfData length], NULL ); _documentRef = CGPDFDocumentCreateWithProvider(_dataProviderRef); _pageRef = CGPDFDocumentGetPage(_documentRef, 1); CGPDFPageRetain(_pageRef); _pdfFrame = CGPDFPageGetBoxRect(_pageRef, kCGPDFArtBox); So the NSData object is retained, and I explicitly retain the page reference. The data provider and the document are already retained by the create-functions. And here is my dealloc method: -(void)dealloc { if (_pageRef) CGPDFPageRelease(_pageRef); if (_documentRef) CGPDFDocumentRelease(_documentRef); if (_dataProviderRef) CGDataProviderRelease(_dataProviderRef); self.pdfData = nil; [super dealloc]; } Am I doing anything wrong? Even an assurance that I'm not, with explanation, would be a help.

Read the article
Testing movie with Flash IDE fails to load file from localhost

- by davgothic

Hi, I'm just wondering if anybody can help me with my simple but frustrating problem. I have created an SWF that loads an XML file from http://localhost/flash/Projects/MEL/Quiz/Quiz/bin/xml/quiz.xml, but I get this error when running the movie using Test Movie in the Flash IDE. Error #2044: Unhandled ioError:. text=Error #2032: Stream Error. URL: http://localhost/flash/Projects/MEL/Quiz/Quiz/bin/xml/quiz.xml at Main/loadConfig()[D:\www\webroot\flash\Projects\MEL\Quiz\Quiz\src\Main.as:126] at Main/configLoadError()[D:\www\webroot\flash\Projects\MEL\Quiz\Quiz\src\Main.as:143] at flash.events::EventDispatcher/dispatchEventFunction() at flash.events::EventDispatcher/dispatchEvent() at flash.net::URLLoader/onComplete() The error I get if I handle the exception is: [IOErrorEvent type="ioError" bubbles=false cancelable=false eventPhase=2 text="Error #2032: Stream Error. URL: http://localhost/flash/Projects/MEL/Quiz/Quiz/bin/xml/quiz.xml"] Trouble is running the SWF in a browser locally does work, it only throws these errors in the Flash IDE. I have tried a adding wildcard crossdomain.xml file in my root web directory and setting the SWF publish properties for local playback security to Allow network only, but neither of these have solved my problem. I know Windows 7 handles localhost name resolution differently compared to previous versions of Windows but I have even added 127.0.0.1 localhost to my hosts file to no avail. Can anyone shed any light on this issue?

Read the article
How to pass SOAP headers into python SUDS that are not defined in WSDL file

- by chrissygormley

Hello, I have a camera on my network, I am trying to connect to it with suds but suds doesn't send all the information needed. I need to put extra soap headers not defined in the WSDL file so the camera can understand the message. All the headers are contained in a SOAP envelope and then the suds command be in the body of the message. I have checked the suds website and it says to pass in the headers like so: from suds.sax.element import Element client = client(url) ssnns = ('ssn', 'http://namespaces/sessionid') ssn = Element('SessionID', ns=ssnns).setText('123') client.set_options(soapheaders=ssn) result = client.service.addPerson(person) Now I am not sure how I would implement this, say for example I have the below header: <?xml version="1.0" encoding="UTF-8"?> <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope" xmlns:SOAP ENC="http://www.w3.org/2003/05/soap-encoding" xmlns:p1="http://www.website.org/ver10/p/wsdl"> .<SOAP-ENV:Header> Using this or a similar example does anyone know hos I would get this passed into the soap command so my camera understands? Thanks

Read the article

Search Results

Search found 19308 results on 773 pages for 'network efficiency'.

Page 703/773 | < Previous Page | 699 700 701 702 703 704 705 706 707 708 709 710 | Next Page >

- by Spirit

- by JMRboosties

- by Will3265

- by JSteve

- by Paul White

- by Claus Jandausch

- by Ralf Westphal

- by James Michael Hare

- by FastAl

- by FastAl

- by user1325655

- by Josh

- by user1669878

- by Heptaparaparshi

- by nordisk

- by Jeremey

- by user317762

- by Michael Stum

- by Noms

- by nandos

- by now

- by Peter Hahndorf

- by Felixyz

- by davgothic

- by chrissygormley

< Previous Page | 699 700 701 702 703 704 705 706 707 708 709 710 | Next Page >