Search Results

Search found 337 results on 14 pages for 'hashing'.

Page 13/14 | < Previous Page | 9 10 11 12 13 14  | Next Page >

  • What guarantees are there on the run-time complexity (Big-O) of LINQ methods?

    - by tzaman
    I've recently started using LINQ quite a bit, and I haven't really seen any mention of run-time complexity for any of the LINQ methods. Obviously, there are many factors at play here, so let's restrict the discussion to the plain IEnumerable LINQ-to-Objects provider. Further, let's assume that any Func passed in as a selector / mutator / etc. is a cheap O(1) operation. It seems obvious that all the single-pass operations (Select, Where, Count, Take/Skip, Any/All, etc.) will be O(n), since they only need to walk the sequence once; although even this is subject to laziness. Things are murkier for the more complex operations; the set-like operators (Union, Distinct, Except, etc.) work using GetHashCode by default (afaik), so it seems reasonable to assume they're using a hash-table internally, making these operations O(n) as well, in general. What about the versions that use an IEqualityComparer? OrderBy would need a sort, so most likely we're looking at O(n log n). What if it's already sorted? How about if I say OrderBy().ThenBy() and provide the same key to both? I could see GroupBy (and Join) using either sorting, or hashing. Which is it? Contains would be O(n) on a List, but O(1) on a HashSet - does LINQ check the underlying container to see if it can speed things up? And the real question - so far, I've been taking it on faith that the operations are performant. However, can I bank on that? STL containers, for example, clearly specify the complexity of every operation. Are there any similar guarantees on LINQ performance in the .NET library specification?

    Read the article

  • CodePlex Daily Summary for Sunday, March 07, 2010

    CodePlex Daily Summary for Sunday, March 07, 2010New ProjectsAlgorithminator: Universal .NET algorithm visualizer, which helps you to illustrate any algorithm, written in any .NET language. Still in development.ALToolkit: Contains a set of handy .NET components/classes. Currently it contains: * A Numeric Text Box (an Extended NumericUpDown) * A Splash Screen base fo...Automaton Home: Automaton is a home automation software built with a n-Tier, MVVM pattern utilzing WCF, EF, WPF, Silverlight and XBAP.Developer Controls: Developer Controls contains various controls to help build applications that can script/write code.Dynamic Reference Manager: Dynamic Reference Manager is a set (more like a small group) of classes and attributes written in C# that allows any .NET program to reference othe...indiologic: Utilities of an IndioNeural Cryptography in F#: This project is my magistracy resulting work. It is intended to be an example of using neural networks in cryptography. Hashing functions are chose...Particle Filter Visualization: Particle Filter Visualization Program for the Intel Science and Engineering FairPólya: Efficient, immutable, polymorphic collections. .Net lacks them, we provide them*. * By we, we mean I; and by efficient, I mean hopefully so.project euler solutions from mhinze: mhinze project euler solutionsSilverlight 4 and WCF multi layer: Silverlight 4 and WCF multi layersqwarea: Project for a browser-based, minimalistic, massively multiplayer strategy game. Part of the "Génie logiciel et Cloud Computing" course of the ENS (...SuperSocket: SuperSocket, a socket application framework can build FTP/SMTP/POP server easilyToast (for ASP.NET MVC): Dynamic, developer & designer friendly content injection, compression and optimization for ASP.NET MVCNew ReleasesALToolkit: ALToolkit 1.0: Binary release of the libraries containing: NumericTextBox SplashScreen Based on the VB.NET code, but that doesn't really matter.Blacklist of Providers: 1.0-Milestone 1: Blacklist of Providers.Milestone 1In this development release implemented - Main interface (Work Item #5453) - Database (Work Item #5523)C# Linear Hash Table: Linear Hash Table b2: Now includes a default constructor, and will throw an exception if capacity is not set to a power of 2 or loadToMaintain is below 1.Composure: CassiniDev-Trunk-40745-VS2010.rc1.NET4: A simple port of the CassiniDev portable web server project for Visual Studio 2010 RC1 built against .NET 4.0. The WCF tests currently fail unless...Developer Controls: DevControls: These are the version 1.0 releases of these controls. Download the individually or all together (in a .zip file). More releases coming soon!Dynamic Reference Manager: DRM Alpha1: This is the first release. I'm calling it Alpha because I intend implementing other functions, but I do not intend changing the way current functio...ESB Toolkit Extensions: Tellago SOA ESB Extenstions v0.3: Windows Installer file that installs Library on a BizTalk ESB 2.0 system. This Install automatically configures the esb.config to use the new compo...GKO Libraries: GKO Libraries 0.1 Alpha: 0.1 AlphaHome Access Plus+: v3.0.3.0: Version 3.0.3.0 Release Change Log: Added Announcement Box Removed script files that aren't needed Fixed & issue in directory path Stylesheet...Icarus Scene Engine: Icarus Scene Engine 1.10.306.840: Icarus Professional, Icarus Player, the supporting software for Icarus Scene Engine, with some included samples, and the start of a tutorial (with ...mavjuz WndLpt: wndlpt-0.2.5: New: Response to 5 LPT inputs "test i 1" New: Reaction to 12 LPT outputs "test q 8" New: Reaction to all LPT pins "test pin 15" New: Syntax: ...Neural Cryptography in F#: Neural Cryptography 0.0.1: The most simple version of this project. It has a neural network that works just like logical AND and a possibility to recreate neural network from...Password Provider: 1.0.3: This release fixes a bug which caused the program to crash when double clicking on a generic item.RoTwee: RoTwee 6.2.0.0: New feature is as next. 16649 Add hashtag for tweet of tune.Now you can tweet your playing tune with hashtag.Visual Studio DSite: Picture Viewer (Visual C++ 2008): This example source code allows you to view any picture you want in the click of a button. All you got to do is click the button and browser via th...WatchersNET CKEditor™ Provider for DotNetNuke: CKEditor Provider 1.8.00: Whats New File Browser: Folders & Files View reworked File Browser: Folders & Files View reworked File Browser: Folders are displayed as TreeVi...WSDLGenerator: WSDLGenerator 0.0.0.4: - replaced CommonLibrary.dll by CommandLineParser.dll - added better support for custom complex typesMost Popular ProjectsMetaSharpSilverlight ToolkitASP.NET Ajax LibraryAll-In-One Code FrameworkWindows 7 USB/DVD Download Toolニコ生アラートWindows Double ExplorerVirtual Router - Wifi Hot Spot for Windows 7 / 2008 R2Caliburn: An Application Framework for WPF and SilverlightArkSwitchMost Active ProjectsUmbraco CMSRawrSDS: Scientific DataSet library and toolsBlogEngine.NETjQuery Library for SharePoint Web Servicespatterns & practices – Enterprise LibraryIonics Isapi Rewrite FilterFarseer Physics EngineFasterflect - A Fast and Simple Reflection APIFluent Assertions

    Read the article

  • NoSQL Java API for MySQL Cluster: Questions & Answers

    - by Mat Keep
    The MySQL Cluster engineering team recently ran a live webinar, available now on-demand demonstrating the ClusterJ and ClusterJPA NoSQL APIs for MySQL Cluster, and how these can be used in building real-time, high scale Java-based services that require continuous availability. Attendees asked a number of great questions during the webinar, and I thought it would be useful to share those here, so others are also able to learn more about the Java NoSQL APIs. First, a little bit about why we developed these APIs and why they are interesting to Java developers. ClusterJ and Cluster JPA ClusterJ is a Java interface to MySQL Cluster that provides either a static or dynamic domain object model, similar to the data model used by JDO, JPA, and Hibernate. A simple API gives users extremely high performance for common operations: insert, delete, update, and query. ClusterJPA works with ClusterJ to extend functionality, including - Persistent classes - Relationships - Joins in queries - Lazy loading - Table and index creation from object model By eliminating data transformations via SQL, users get lower data access latency and higher throughput. In addition, Java developers have a more natural programming method to directly manage their data, with a complete, feature-rich solution for Object/Relational Mapping. As a result, the development of Java applications is simplified with faster development cycles resulting in accelerated time to market for new services. MySQL Cluster offers multiple NoSQL APIs alongside Java: - Memcached for a persistent, high performance, write-scalable Key/Value store, - HTTP/REST via an Apache module - C++ via the NDB API for the lowest absolute latency. Developers can use SQL as well as NoSQL APIs for access to the same data set via multiple query patterns – from simple Primary Key lookups or inserts to complex cross-shard JOINs using Adaptive Query Localization Marrying NoSQL and SQL access to an ACID-compliant database offers developers a number of benefits. MySQL Cluster’s distributed, shared-nothing architecture with auto-sharding and real time performance makes it a great fit for workloads requiring high volume OLTP. Users also get the added flexibility of being able to run real-time analytics across the same OLTP data set for real-time business insight. OK – hopefully you now have a better idea of why ClusterJ and JPA are available. Now, for the Q&A. Q & A Q. Why would I use Connector/J vs. ClusterJ? A. Partly it's a question of whether you prefer to work with SQL (Connector/J) or objects (ClusterJ). Performance of ClusterJ will be better as there is no need to pass through the MySQL Server. A ClusterJ operation can only act on a single table (e.g. no joins) - ClusterJPA extends that capability Q. Can I mix different APIs (ie ClusterJ, Connector/J) in our application for different query types? A. Yes. You can mix and match all of the API types, SQL, JDBC, ODBC, ClusterJ, Memcached, REST, C++. They all access the exact same data in the data nodes. Update through one API and new data is instantly visible to all of the others. Q. How many TCP connections would a SessionFactory instance create for a cluster of 8 data nodes? A. SessionFactory has a connection to the mgmd (management node) but otherwise is just a vehicle to create Sessions. Without using connection pooling, a SessionFactory will have one connection open with each data node. Using optional connection pooling allows multiple connections from the SessionFactory to increase throughput. Q. Can you give details of how Cluster J optimizes sharding to enhance performance of distributed query processing? A. Each data node in a cluster runs a Transaction Coordinator (TC), which begins and ends the transaction, but also serves as a resource to operate on the result rows. While an API node (such as a ClusterJ process) can send queries to any TC/data node, there are performance gains if the TC is where most of the result data is stored. ClusterJ computes the shard (partition) key to choose the data node where the row resides as the TC. Q. What happens if we perform two primary key lookups within the same transaction? Are they sent to the data node in one transaction? A. ClusterJ will send identical PK lookups to the same data node. Q. How is distributed query processing handled by MySQL Cluster ? A. If the data is split between data nodes then all of the information will be transparently combined and passed back to the application. The session will connect to a data node - typically by hashing the primary key - which then interacts with its neighboring nodes to collect the data needed to fulfil the query. Q. Can I use Foreign Keys with MySQL Cluster A. Support for Foreign Keys is included in the MySQL Cluster 7.3 Early Access release Summary The NoSQL Java APIs are packaged with MySQL Cluster, available for download here so feel free to take them for a spin today! Key Resources MySQL Cluster on-line demo  MySQL ClusterJ and JPA On-demand webinar  MySQL ClusterJ and JPA documentation MySQL ClusterJ and JPA whitepaper and tutorial

    Read the article

  • Some non-generic collections

    - by Simon Cooper
    Although the collections classes introduced in .NET 2, 3.5 and 4 cover most scenarios, there are still some .NET 1 collections that don't have generic counterparts. In this post, I'll be examining what they do, why you might use them, and some things you'll need to bear in mind when doing so. BitArray System.Collections.BitArray is conceptually the same as a List<bool>, but whereas List<bool> stores each boolean in a single byte (as that's what the backing bool[] does), BitArray uses a single bit to store each value, and uses various bitmasks to access each bit individually. This means that BitArray is eight times smaller than a List<bool>. Furthermore, BitArray has some useful functions for bitmasks, like And, Xor and Not, and it's not limited to 32 or 64 bits; a BitArray can hold as many bits as you need. However, it's not all roses and kittens. There are some fundamental limitations you have to bear in mind when using BitArray: It's a non-generic collection. The enumerator returns object (a boxed boolean), rather than an unboxed bool. This means that if you do this: foreach (bool b in bitArray) { ... } Every single boolean value will be boxed, then unboxed. And if you do this: foreach (var b in bitArray) { ... } you'll have to manually unbox b on every iteration, as it'll come out of the enumerator an object. Instead, you should manually iterate over the collection using a for loop: for (int i=0; i<bitArray.Length; i++) { bool b = bitArray[i]; ... } Following on from that, if you want to use BitArray in the context of an IEnumerable<bool>, ICollection<bool> or IList<bool>, you'll need to write a wrapper class, or use the Enumerable.Cast<bool> extension method (although Cast would box and unbox every value you get out of it). There is no Add or Remove method. You specify the number of bits you need in the constructor, and that's what you get. You can change the length yourself using the Length property setter though. It doesn't implement IList. Although not really important if you're writing a generic wrapper around it, it is something to bear in mind if you're using it with pre-generic code. However, if you use BitArray carefully, it can provide significant gains over a List<bool> for functionality and efficiency of space. OrderedDictionary System.Collections.Specialized.OrderedDictionary does exactly what you would expect - it's an IDictionary that maintains items in the order they are added. It does this by storing key/value pairs in a Hashtable (to get O(1) key lookup) and an ArrayList (to maintain the order). You can access values by key or index, and insert or remove items at a particular index. The enumerator returns items in index order. However, the Keys and Values properties return ICollection, not IList, as you might expect; CopyTo doesn't maintain the same ordering, as it copies from the backing Hashtable, not ArrayList; and any operations that insert or remove items from the middle of the collection are O(n), just like a normal list. In short; don't use this class. If you need some sort of ordered dictionary, it would be better to write your own generic dictionary combining a Dictionary<TKey, TValue> and List<KeyValuePair<TKey, TValue>> or List<TKey> for your specific situation. ListDictionary and HybridDictionary To look at why you might want to use ListDictionary or HybridDictionary, we need to examine the performance of these dictionaries compared to Hashtable and Dictionary<object, object>. For this test, I added n items to each collection, then randomly accessed n/2 items: So, what's going on here? Well, ListDictionary is implemented as a linked list of key/value pairs; all operations on the dictionary require an O(n) search through the list. However, for small n, the constant factor that big-o notation doesn't measure is much lower than the hashing overhead of Hashtable or Dictionary. HybridDictionary combines a Hashtable and ListDictionary; for small n, it uses a backing ListDictionary, but switches to a Hashtable when it gets to 9 items (you can see the point it switches from a ListDictionary to Hashtable in the graph). Apart from that, it's got very similar performance to Hashtable. So why would you want to use either of these? In short, you wouldn't. Any gain in performance by using ListDictionary over Dictionary<TKey, TValue> would be offset by the generic dictionary not having to cast or box the items you store, something the graphs above don't measure. Only if the performance of the dictionary is vital, the dictionary will hold less than 30 items, and you don't need type safety, would you use ListDictionary over the generic Dictionary. And even then, there's probably more useful performance gains you can make elsewhere.

    Read the article

  • Best way to store a large amount of game objects and update the ones onscreen

    - by user3002473
    Good afternoon guys! I'm a young beginner game developer working on my first large scale game project and I've run into a situation where I'm not quite sure what the best solution may be (if there is a lone solution). The question may be vague (if anyone can think of a better title after having read the question, please edit it) or broad but I'm not quite sure what to do and I thought it would help just to discuss the problem with people more educated in the field. Before we get started, here are some of the questions I've looked at for help in the past: Best way to keep track of game objects Elegant way to simulate large amounts of entities within a game world What is the most efficient container to store dynamic game objects in? I've also read articles about different data structures commonly used in games to store game objects such as this one about slot maps, but none of them are really what I'm looking for. Also, if it helps at all I'm using Python 3 to design the game. It has to be Python 3, if I could I would use C++ or Unityscript or something else, but I'm restricted to having to use Python 3. My game will be a form of side scroller shooter game. In said game the player will traverse large rooms with large amounts of enemies and other game objects to update (think some of the larger areas in Cave Story or Iji). The player obviously can't see the entire room all at once, so there is a viewport that follows the player around and renders only a selection of the room and the game objects that it contains. This is not a foreign concept. The part that's getting me confused has to do with how certain game objects are updated. Some of them are to be updated constantly, regardless of whether or not they can be seen. Other objects however are only to be updated when they are onscreen (for example, an enemy would only be updated to react to the player when it is onscreen or when it is in a certain range of the screen). Another problem is that game objects have to be easily referable by other game objects; something that happens in the player's update() method may affect another object in the world. Collision detection in games is always a serious problem. I need a way of containing the game objects such that it minimizes the number of cases when testing for collisions against one another. The final problem is that of creating and destroying game objects. I think this problem is pretty self explanatory. To store the game objects then I've considered a number of different methods. The original method I had was to simply store all the objects in a hash table by an id. This method was simple, and decently fast as it allows all the objects to be looked up in O(1) complexity, and also allows them to be deleted fairly easily. Hash collisions would not be a major problem; I wasn't originally planning on using computer generated ids to store the game objects I was going to rely on them all using ids given to them by the game designer (such names would be strings like 'Player' or 'EnemyWeapon4'), and even if I did use computer generated ids, if I used a decent hashing algorithm then the chances of collisions would be around 1 in 4 billion. The problem with using a hash table however is that it is inefficient in checking to see what objects are in range of the viewport. Considering the fact that certain game objects move (as well as the viewport itself), the only solution I could think of in order to only update objects that are in the viewport would be to iterate through every object in the hash table and check if it is in the viewport or not, updating only the ones that are in the valid area. This would be incredibly slow in scenarios where the amount of game objects exceeds 500, or even 200. The second solution was to store everything in a 2-d list. The world is partitioned up into cells (a tilemap essentially), where each cell or tile is the same size and is square. Each cell would contain a list of the game objects that are currently occupying it (each game object would be inserted into a cell depending on the center of the object's collision mask). A 2-d list would allow me to take the top-left and bottom-right corners of the viewport and easily grab a rectangular area of the grid containing only the cells containing entities that are in valid range to be updated. This method also solves the problem of collision detection; when I take an entity I can find the cell that it is currently in, then check only against entities in it's cell and the 8 cells around it. One problem with this system however is that it prohibits easy lookup of game objects. One solution I had would be to simultaneously keep a hash table that would contain all the positions of the objects in the 2-d list indexed by the id of said object. The major problem with a 2-d list is that it would need to be rebuilt every single game frame (along with the hash table of object positions), which may be a serious detriment to game speed. Both systems have ups and downs and seem to solve some of each other's problems, however using them both together doesn't seem like the best solution either. If anyone has any thoughts, ideas, suggestions, comments, opinions or solutions on new data structures or better implementations of the existing data structures I have in mind, please post, any and all criticism and help is welcome. Thanks in advance! EDIT: Please don't close the question because it has a bad title, I'm just bad with names!

    Read the article

  • Developing Schema Compare for Oracle (Part 6): 9i Query Performance

    - by Simon Cooper
    All throughout the EAP and beta versions of Schema Compare for Oracle, our main request was support for Oracle 9i. After releasing version 1.0 with support for 10g and 11g, our next step was then to get version 1.1 of SCfO out with support for 9i. However, there were some significant problems that we had to overcome first. This post will concentrate on query execution time. When we first tested SCfO on a 9i server, after accounting for various changes to the data dictionary, we found that database registration was taking a long time. And I mean a looooooong time. The same database that on 10g or 11g would take a couple of minutes to register would be taking upwards of 30 mins on 9i. Obviously, this is not ideal, so a poke around the query execution plans was required. As an example, let's take the table population query - the one that reads ALL_TABLES and joins it with a few other dictionary views to get us back our list of tables. On 10g, this query takes 5.6 seconds. On 9i, it takes 89.47 seconds. The difference in execution plan is even more dramatic - here's the (edited) execution plan on 10g: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 108K| 939 || 1 | SORT ORDER BY | | 108K| 939 || 2 | NESTED LOOPS OUTER | | 108K| 938 ||* 3 | HASH JOIN RIGHT OUTER | | 103K| 762 || 4 | VIEW | ALL_EXTERNAL_LOCATIONS | 2058 | 3 ||* 20 | HASH JOIN RIGHT OUTER | | 73472 | 759 || 21 | VIEW | ALL_EXTERNAL_TABLES | 2097 | 3 ||* 34 | HASH JOIN RIGHT OUTER | | 39920 | 755 || 35 | VIEW | ALL_MVIEWS | 51 | 7 || 58 | NESTED LOOPS OUTER | | 39104 | 748 || 59 | VIEW | ALL_TABLES | 6704 | 668 || 89 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2025 | 5 || 106 | VIEW | ALL_PART_TABLES | 277 | 11 |------------------------------------------------------------------------------- And the same query on 9i: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 16P| 55G|| 1 | SORT ORDER BY | | 16P| 55G|| 2 | NESTED LOOPS OUTER | | 16P| 862M|| 3 | NESTED LOOPS OUTER | | 5251G| 992K|| 4 | NESTED LOOPS OUTER | | 4243M| 2578 || 5 | NESTED LOOPS OUTER | | 2669K| 1440 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 ||* 50 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2043 | ||* 66 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_TABLES | 1777K| ||* 80 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_LOCATIONS | 1744K| ||* 96 | VIEW | ALL_PART_TABLES | 852K| |------------------------------------------------------------------------------- Have a look at the cost column. 10g's overall query cost is 939, and 9i is 55,000,000,000 (or more precisely, 55,496,472,769). It's also having to process far more data. What on earth could be causing this huge difference in query cost? After trawling through the '10g New Features' documentation, we found item 1.9.2.21. Before 10g, Oracle advised that you do not collect statistics on data dictionary objects. From 10g, it advised that you do collect statistics on the data dictionary; for our queries, Oracle therefore knows what sort of data is in the dictionary tables, and so can generate an efficient execution plan. On 9i, no statistics are present on the system tables, so Oracle has to use the Rule Based Optimizer, which turns most LEFT JOINs into nested loops. If we force 9i to use hash joins, like 10g, we get a much better plan: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 7587K| 3704 || 1 | SORT ORDER BY | | 7587K| 3704 ||* 2 | HASH JOIN OUTER | | 7587K| 822 ||* 3 | HASH JOIN OUTER | | 5262K| 616 ||* 4 | HASH JOIN OUTER | | 2980K| 465 ||* 5 | HASH JOIN OUTER | | 710K| 432 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 || 50 | VIEW | ALL_PART_TABLES | 852K| 104 || 78 | VIEW | ALL_TAB_COMMENTS | 2043 | 14 || 93 | VIEW | ALL_EXTERNAL_LOCATIONS | 1744K| 31 || 106 | VIEW | ALL_EXTERNAL_TABLES | 1777K| 28 |------------------------------------------------------------------------------- That's much more like it. This drops the execution time down to 24 seconds. Not as good as 10g, but still an improvement. There are still several problems with this, however. 10g introduced a new join method - a right outer hash join (used in the first execution plan). The 9i query optimizer doesn't have this option available, so forcing a hash join means it has to hash the ALL_TABLES table, and furthermore re-hash it for every hash join in the execution plan; this could be thousands and thousands of rows. And although forcing hash joins somewhat alleviates this problem on our test systems, there's no guarantee that this will improve the execution time on customers' systems; it may even increase the time it takes (say, if all their tables are partitioned, or they've got a lot of materialized views). Ideally, we would want a solution that provides a speedup whatever the input. To try and get some ideas, we asked some oracle performance specialists to see if they had any ideas or tips. Their recommendation was to add a hidden hook into the product that allowed users to specify their own query hints, or even rewrite the queries entirely. However, we would prefer not to take that approach; as well as a lot of new infrastructure & a rewrite of the population code, it would have meant that any users of 9i would have to spend some time optimizing it to get it working on their system before they could use the product. Another approach was needed. All our population queries have a very specific pattern - a base table provides most of the information we need (ALL_TABLES for tables, or ALL_TAB_COLS for columns) and we do a left join to extra subsidiary tables that fill in gaps (for instance, ALL_PART_TABLES for partition information). All the left joins use the same set of columns to join on (typically the object owner & name), so we could re-use the hash information for each join, rather than re-hashing the same columns for every join. To allow us to do this, along with various other performance improvements that could be done for the specific query pattern we were using, we read all the tables individually and do a hash join on the client. Fortunately, this 'pure' algorithmic problem is the kind that can be very well optimized for expected real-world situations; as well as storing row data we're not using in the hash key on disk, we use very specific memory-efficient data structures to store all the information we need. This allows us to achieve a database population time that is as fast as on 10g, and even (in some situations) slightly faster, and a memory overhead of roughly 150 bytes per row of data in the result set (for schemas with 10,000 tables in that means an extra 1.4MB memory being used during population). Next: fun with the 9i dictionary views.

    Read the article

  • Moving from Linear Probing to Quadratic Probing (hash collisons)

    - by Nazgulled
    Hi, My current implementation of an Hash Table is using Linear Probing and now I want to move to Quadratic Probing (and later to chaining and maybe double hashing too). I've read a few articles, tutorials, wikipedia, etc... But I still don't know exactly what I should do. Linear Probing, basically, has a step of 1 and that's easy to do. When searching, inserting or removing an element from the Hash Table, I need to calculate an hash and for that I do this: index = hash_function(key) % table_size; Then, while searching, inserting or removing I loop through the table until I find a free bucket, like this: do { if(/* CHECK IF IT'S THE ELEMENT WE WANT */) { // FOUND ELEMENT return; } else { index = (index + 1) % table_size; } while(/* LOOP UNTIL IT'S NECESSARY */); As for Quadratic Probing, I think what I need to do is change how the "index" step size is calculated but that's what I don't understand how I should do it. I've seen various pieces of code, and all of them are somewhat different. Also, I've seen some implementations of Quadratic Probing where the hash function is changed to accommodated that (but not all of them). Is that change really needed or can I avoid modifying the hash function and still use Quadratic Probing? EDIT: After reading everything pointed out by Eli Bendersky below I think I got the general idea. Here's part of the code at http://eternallyconfuzzled.com/tuts/datastructures/jsw_tut_hashtable.aspx: 15 for ( step = 1; table->table[h] != EMPTY; step++ ) { 16 if ( compare ( key, table->table[h] ) == 0 ) 17 return 1; 18 19 /* Move forward by quadratically, wrap if necessary */ 20 h = ( h + ( step * step - step ) / 2 ) % table->size; 21 } There's 2 things I don't get... They say that quadratic probing is usually done using c(i)=i^2. However, in the code above, it's doing something more like c(i)=(i^2-i)/2 I was ready to implement this on my code but I would simply do: index = (index + (index^index)) % table_size; ...and not: index = (index + (index^index - index)/2) % table_size; If anything, I would do: index = (index + (index^index)/2) % table_size; ...cause I've seen other code examples diving by two. Although I don't understand why... 1) Why is it subtracting the step? 2) Why is it diving it by 2?

    Read the article

  • Unable to verify body hash for DKIM

    - by Joshua
    I'm writing a C# DKIM validator and have come across a problem that I cannot solve. Right now I am working on calculating the body hash, as described in Section 3.7 Computing the Message Hashes. I am working with emails that I have dumped using a modified version of EdgeTransportAsyncLogging sample in the Exchange 2010 Transport Agent SDK. Instead of converting the emails when saving, it just opens a file based on the MessageID and dumps the raw data to disk. I am able to successfully compute the body hash of the sample email provided in Section A.2 using the following code: SHA256Managed hasher = new SHA256Managed(); ASCIIEncoding asciiEncoding = new ASCIIEncoding(); string rawFullMessage = File.ReadAllText(@"C:\Repositories\Sample-A.2.txt"); string headerDelimiter = "\r\n\r\n"; int headerEnd = rawFullMessage.IndexOf(headerDelimiter); string header = rawFullMessage.Substring(0, headerEnd); string body = rawFullMessage.Substring(headerEnd + headerDelimiter.Length); byte[] bodyBytes = asciiEncoding.GetBytes(body); byte[] bodyHash = hasher.ComputeHash(bodyBytes); string bodyBase64 = Convert.ToBase64String(bodyHash); string expectedBase64 = "2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8="; Console.WriteLine("Expected hash: {1}{0}Computed hash: {2}{0}Are equal: {3}", Environment.NewLine, expectedBase64, bodyBase64, expectedBase64 == bodyBase64); The output from the above code is: Expected hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8= Computed hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8= Are equal: True Now, most emails come across with the c=relaxed/relaxed setting, which requires you to do some work on the body and header before hashing and verifying. And while I was working on it (failing to get it to work) I finally came across a message with c=simple/simple which means that you process the whole body as is minus any empty CRLF at the end of the body. (Really, the rules for Body Canonicalization are quite ... simple.) Here is the real DKIM email with a signature using the simple algorithm (with only unneeded headers cleaned up). Now, using the above code and updating the expectedBase64 hash I get the following results: Expected hash: VnGg12/s7xH3BraeN5LiiN+I2Ul/db5/jZYYgt4wEIw= Computed hash: ISNNtgnFZxmW6iuey/3Qql5u6nflKPTke4sMXWMxNUw= Are equal: False The expected hash is the value from the bh= field of the DKIM-Signature header. Now, the file used in the second test is a direct raw output from the Exchange 2010 Transport Agent. If so inclined, you can view the modified EdgeTransportLogging.txt. At this point, no matter how I modify the second email, changing the start position or number of CRLF at the end of the file I cannot get the files to match. What worries me is that I have been unable to validate any body hash so far (simple or relaxed) and that it may not be feasible to process DKIM through Exchange 2010.

    Read the article

  • Codeigniter or PHP Amazon API help

    - by faya
    Hello, I have a problem searching through amazon web servise using PHP in my CodeIgniter. I get InvalidParameter timestamp is not in ISO-8601 format response from the server. But I don't think that timestamp is the problem,because I have tryed to compare with given date format from http://associates-amazon.s3.amazonaws.com/signed-requests/helper/index.html and it seems its fine. Could anyone help? Here is my code: $private_key = 'XXXXXXXXXXXXXXXX'; // Took out real secret key $method = "GET"; $host = "ecs.amazonaws.com"; $uri = "/onca/xml"; $timeStamp = gmdate("Y-m-d\TH:i:s.000\Z"); $timeStamp = str_replace(":", "%3A", $timeStamp); $params["AWSAccesskeyId"] = "XXXXXXXXXXXX"; // Took out real access key $params["ItemPage"] = $item_page; $params["Keywords"] = $keywords; $params["ResponseGroup"] = "Medium2%2525COffers"; $params["SearchIndex"] = "Books"; $params["Operation"] = "ItemSearch"; $params["Service"] = "AWSECommerceService"; $params["Timestamp"] = $timeStamp; $params["Version"] = "2009-03-31"; ksort($params); $canonicalized_query = array(); foreach ($params as $param=>$value) { $param = str_replace("%7E", "~", rawurlencode($param)); $value = str_replace("%7E", "~", rawurlencode($value)); $canonicalized_query[] = $param. "=". $value; } $canonicalized_query = implode("&", $canonicalized_query); $string_to_sign = $method."\n\r".$host."\n\r".$uri."\n\r".$canonicalized_query; $signature = base64_encode(hash_hmac("sha256",$string_to_sign, $private_key, True)); $signature = str_replace("%7E", "~", rawurlencode($signature)); $request = "http://".$host.$uri."?".$canonicalized_query."&Signature=".$signature; $response = @file_get_contents($request); if ($response === False) { return "response fail"; } else { $parsed_xml = simplexml_load_string($response); if ($parsed_xml === False) { return "parse fail"; } else { return $parsed_xml; } } P.S. - Personally I think that something is wrong in the generation of the from the $string_to_sign when hashing it.

    Read the article

  • Recursive N-way merge/diff algorithm for directory trees?

    - by BobMcGee
    What algorithms or Java libraries are available to do N-way, recursive diff/merge of directories? I need to be able to generate a list of folder trees that have many identical files, and have subdirectories with many similar files. I want to be able to use 2-way merge operations to quickly remove as much redundancy as possible. Goals: Find pairs of directories that have many similar files between them. Generate short list of directory pairs that can be synchronized with 2-way merge to eliminate duplicates Should operate recursively (there may be nested duplicates of higher-level directories) Run time and storage should be O(n log n) in numbers of directories and files Should be able to use an embedded DB or page to disk for processing more files than fit in memory (100,000+). Optional: generate an ancestry and change-set between folders Optional: sort the merge operations by how many duplicates they can elliminate I know how to use hashes to find duplicate files in roughly O(n) space, but I'm at a loss for how to go from this to finding partially overlapping sets between folders and their children. EDIT: some clarification The tricky part is the difference between "exact same" contents (otherwise hashing file hashes would work) and "similar" (which will not). Basically, I want to feed this algorithm at a set of directories and have it return a set of 2-way merge operations I can perform in order to reduce duplicates as much as possible with as few conflicts possible. It's effectively constructing an ancestry tree showing which folders are derived from each other. The end goal is to let me incorporate a bunch of different folders into one common tree. For example, I may have a folder holding programming projects, and then copy some of its contents to another computer to work on it. Then I might back up and intermediate version to flash drive. Except I may have 8 or 10 different versions, with slightly different organizational structures or folder names. I need to be able to merge them one step at a time, so I can chose how to incorporate changes at each step of the way. This is actually more or less what I intend to do with my utility (bring together a bunch of scattered backups from different points in time). I figure if I can do it right I may as well release it as a small open source util. I think the same tricks might be useful for comparing XML trees though.

    Read the article

  • How does the rsync algorithm correctly identify repeating blocks?

    - by Kai
    I'm on a personal quest to learn how the rsync algorithm works. After some reading and thinking, I've come up with a situation where I think the algorithm fails. I'm trying to figure out how this is resolved in an actual implementation. Consider this example, where A is the receiver and B is the sender. A = abcde1234512345fghij B = abcde12345fghij As you can see, the only change is that 12345 has been removed. Now, to make this example interesting, let's choose a block size of 5 bytes (chars). Hashing the values on the sender's side using the weak checksum gives the following values list. abcde|12345|fghij abcde -> 495 12345 -> 255 fghij -> 520 values = [495, 255, 520] Next we check to see if any hash values differ in A. If there's a matching block we can skip to the end of that block for the next check. If there's a non-matching block then we've found a difference. I'll step through this process. Hash the first block. Does this hash exist in the values list? abcde -> 495 (yes, so skip) Hash the second block. Does this hash exist in the values list? 12345 -> 255 (yes, so skip) Hash the third block. Does this hash exist in the values list? 12345 -> 255 (yes, so skip) Hash the fourth block. Does this hash exist in the values list? fghij -> 520 (yes, so skip) No more data, we're done. Since every hash was found in the values list, we conclude that A and B are the same. Which, in my humble opinion, isn't true. It seems to me this will happen whenever there is more than one block that share the same hash. What am I missing?

    Read the article

  • Instance caching in Objective C

    - by zoul
    Hello! I want to cache the instances of a certain class. The class keeps a dictionary of all its instances and when somebody requests a new instance, the class tries to satisfy the request from the cache first. There is a small problem with memory management though: The dictionary cache retains the inserted objects, so that they never get deallocated. I do want them to get deallocated, so that I had to overload the release method and when the retain count drops to one, I can remove the instance from cache and let it get deallocated. This works, but I am not comfortable mucking around the release method and find the solution overly complicated. I thought I could use some hashing class that does not retain the objects it stores. Is there such? The idea is that when the last user of a certain instance releases it, the instance would automatically disappear from the cache. NSHashTable seems to be what I am looking for, but the documentation talks about “supporting weak relationships in a garbage-collected environment.” Does it also work without garbage collection? Clarification: I cannot afford to keep the instances in memory unless somebody really needs them, that is why I want to purge the instance from the cache when the last “real” user releases it. Better solution: This was on the iPhone, I wanted to cache some textures and on the other hand I wanted to free them from memory as soon as the last real holder released them. The easier way to code this is through another class (let’s call it TextureManager). This class manages the texture instances and caches them, so that subsequent calls for texture with the same name are served from the cache. There is no need to purge the cache immediately as the last user releases the texture. We can simply keep the texture cached in memory and when the device gets short on memory, we receive the low memory warning and can purge the cache. This is a better solution, because the caching stuff does not pollute the Texture class, we do not have to mess with release and there is even a higher chance for cache hits. The TextureManager can be abstracted into a ResourceManager, so that it can cache other data, not only textures.

    Read the article

  • Delphi - Read File To StringList, then delete and write back to file.

    - by Jkraw90
    I'm currently working on a program to generate the hashes of files, in Delphi 2010. As part of this I have a option to create User Presets, e.g. pre-defined choice of hashing algo's which the user can create/save/delete. I have the create and load code working fine. It uses a ComboBox and loads from a file "fhpre.ini", inside this file is the users presets stored in format of:- PresetName PresetCode (a 12 digit string using 0 for don't hash and 1 for do) On application loading it loads the data from this file into the ComboBox and an Array with the ItemIndex of ComboBox matching the corrisponding correct string of 0's and 1's in the Array. Now I need to implement a feature to have the user delete a preset from the list. So far my code is as follows, procedure TForm1.Panel23Click(Sender : TObject); var fil : textfile; contents : TStringList; x,i : integer; filline : ansistring; filestream : TFileStream; begin //Start Procedure //Load data into StringList contents := TStringList.Create; fileStream := TFileStream.Create((GetAppData+'\RFA\fhpre.ini'), fmShareDenyNone); Contents.LoadFromStream(fileStream); fileStream.Destroy(); //Search for relevant Preset i := 0; if ComboBox4.Text <> Contents[i] then begin Repeat i := i + 1; Until ComboBox4.Text = Contents[i]; end; contents.Delete(i); //Delete Relevant Preset Name contents.Delete(i); //Delete Preset Digit String //Write StringList back to file. AssignFile(fil,(GetAppData+'\RFA\fhpre.ini')); ReWrite(fil); for i := 0 to Contents.Count -1 do WriteLn(Contents[i]); CloseFile(fil); Contents.Free; end; However if this is run, I get a 105 error when it gets to the WriteLn section. I'm aware that the code isn't great, for example doesn't have checks for presets with same name, but that will come, I want to get the base code working first then can tweak and add extra checks etc. Any help would be appreciated.

    Read the article

  • Security review of an authenticated Diffie Hellman variant

    - by mtraut
    EDIT I'm still hoping for some advice on this, i tried to clarify my intentions... When i came upon device pairing in my mobile communication framework i studied a lot of papers on this topic and and also got some input from previous questions here. But, i didn't find a ready to implement protocol solution - so i invented a derivate and as i'm no crypto geek i'm not sure about the security caveats of the final solution: The main questions are Is SHA256 sufficient as a commit function? Is the addition of the shared secret as an authentication info in the commit string safe? What is the overall security of the 1024 bit group DH I assume at most 2^-24 bit probability of succesful MITM attack (because of 24 bit challenge). Is this plausible? What may be the most promising attack (besides ripping the device out off my numb, cold hands) This is the algorithm sketch For first time pairing, a solution proposed in "Key agreement in peer-to-peer wireless networks" (DH-SC) is implemented. I based it on a commitment derived from: A fix "UUID" for the communicating entity/role (128 bit, sent at protocol start, before commitment) The public DH key (192 bit private key, based on the 1024 bit Oakley group) A 24 bit random challenge Commit is computed using SHA256 c = sha256( UUID || DH pub || Chall) Both parties exchange this commitment, open and transfer the plain content of the above values. The 24 bit random is displayed to the user for manual authentication DH session key (128 bytes, see above) is computed When the user opts for persistent pairing, the session key is stored with the remote UUID as a shared secret Next time devices connect, commit is computed by additionally hashing the previous DH session key before the random challenge. For sure it is not transfered when opening. c = sha256( UUID || DH pub || DH sess || Chall) Now the user is not bothered authenticating when the local party can derive the same commitment using his own, stored previous DH session key. After succesful connection the new DH session key becomes the new shared secret. As this does not exactly fit the protocols i found so far (and as such their security proofs), i'd be very interested to get an opinion from some more crypto enabled guys here. BTW. i did read about the "EKE" protocol, but i'm not sure what the extra security level is.

    Read the article

  • Akka framework support for finding duplicate messages

    - by scala_is_awesome
    I'm trying to build a high-performance distributed system with Akka and Scala. If a message requesting an expensive (and side-effect-free) computation arrives, and the exact same computation has already been requested before, I want to avoid computing the result again. If the computation requested previously has already completed and the result is available, I can cache it and re-use it. However, the time window in which duplicate computation can be requested may be arbitrarily small. e.g. I could get a thousand or a million messages requesting the same expensive computation at the same instant for all practical purposes. There is a commercial product called Gigaspaces that supposedly handles this situation. However there seems to be no framework support for dealing with duplicate work requests in Akka at the moment. Given that the Akka framework already has access to all the messages being routed through the framework, it seems that a framework solution could make a lot of sense here. Here is what I am proposing for the Akka framework to do: 1. Create a trait to indicate a type of messages (say, "ExpensiveComputation" or something similar) that are to be subject to the following caching approach. 2. Smartly (hashing etc.) identify identical messages received by (the same or different) actors within a user-configurable time window. Other options: select a maximum buffer size of memory to be used for this purpose, subject to (say LRU) replacement etc. Akka can also choose to cache only the results of messages that were expensive to process; the messages that took very little time to process can be re-processed again if needed; no need to waste precious buffer space caching them and their results. 3. When identical messages (received within that time window, possibly "at the same time instant") are identified, avoid unnecessary duplicate computations. The framework would do this automatically, and essentially, the duplicate messages would never get received by a new actor for processing; they would silently vanish and the result from processing it once (whether that computation was already done in the past, or ongoing right then) would get sent to all appropriate recipients (immediately if already available, and upon completion of the computation if not). Note that messages should be considered identical even if the "reply" fields are different, as long as the semantics/computations they represent are identical in every other respect. Also note that the computation should be purely functional, i.e. free from side-effects, for the caching optimization suggested to work and not change the program semantics at all. If what I am suggesting is not compatible with the Akka way of doing things, and/or if you see some strong reasons why this is a very bad idea, please let me know. Thanks, Is Awesome, Scala

    Read the article

  • Improving TCP performance over a gigabit network with lots of connections and high traffic of small packets

    - by MinimeDJ
    I’m trying to improve my TCP throughput over a “gigabit network with lots of connections and high traffic of small packets”. My server OS is Ubuntu 11.10 Server 64bit. There are about 50.000 (and growing) clients connected to my server through TCP Sockets (all on the same port). 95% of of my packets have size of 1-150 bytes (TCP header and payload). The rest 5% vary from 150 up to 4096+ bytes. With the config below my server can handle traffic up to 30 Mbps (full duplex). Can you please advice best practice to tune OS for my needs? My /etc/sysctl.cong looks like this: kernel.pid_max = 1000000 net.ipv4.ip_local_port_range = 2500 65000 fs.file-max = 1000000 # net.core.netdev_max_backlog=3000 net.ipv4.tcp_sack=0 # net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 net.core.somaxconn = 2048 # net.ipv4.tcp_rmem = 4096 87380 16777216 net.ipv4.tcp_wmem = 4096 65536 16777216 # net.ipv4.tcp_synack_retries = 2 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_mem = 50576 64768 98152 # net.core.wmem_default = 65536 net.core.rmem_default = 65536 net.ipv4.tcp_window_scaling=1 # net.ipv4.tcp_mem= 98304 131072 196608 # net.ipv4.tcp_timestamps = 0 net.ipv4.tcp_rfc1337 = 1 net.ipv4.ip_forward = 0 net.ipv4.tcp_congestion_control=cubic net.ipv4.tcp_tw_recycle = 0 net.ipv4.tcp_tw_reuse = 0 # net.ipv4.tcp_orphan_retries = 1 net.ipv4.tcp_fin_timeout = 25 net.ipv4.tcp_max_orphans = 8192 Here are my limits: $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 193045 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1000000 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1000000 [ADDED] My NICs are the following: $ dmesg | grep Broad [ 2.473081] Broadcom NetXtreme II 5771x 10Gigabit Ethernet Driver bnx2x 1.62.12-0 (2011/03/20) [ 2.477808] bnx2x 0000:02:00.0: eth0: Broadcom NetXtreme II BCM57711E XGb (A0) PCI-E x4 5GHz (Gen2) found at mem fb000000, IRQ 28, node addr d8:d3:85:bd:23:08 [ 2.482556] bnx2x 0000:02:00.1: eth1: Broadcom NetXtreme II BCM57711E XGb (A0) PCI-E x4 5GHz (Gen2) found at mem fa000000, IRQ 40, node addr d8:d3:85:bd:23:0c [ADDED 2] ethtool -k eth0 Offload parameters for eth0: rx-checksumming: on tx-checksumming: on scatter-gather: on tcp-segmentation-offload: on udp-fragmentation-offload: off generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: on rx-vlan-offload: on tx-vlan-offload: on ntuple-filters: off receive-hashing: off [ADDED 3] sudo ethtool -S eth0|grep -vw 0 NIC statistics: [1]: rx_bytes: 17521104292 [1]: rx_ucast_packets: 118326392 [1]: tx_bytes: 35351475694 [1]: tx_ucast_packets: 191723897 [2]: rx_bytes: 16569945203 [2]: rx_ucast_packets: 114055437 [2]: tx_bytes: 36748975961 [2]: tx_ucast_packets: 194800859 [3]: rx_bytes: 16222309010 [3]: rx_ucast_packets: 109397802 [3]: tx_bytes: 36034786682 [3]: tx_ucast_packets: 198238209 [4]: rx_bytes: 14884911384 [4]: rx_ucast_packets: 104081414 [4]: rx_discards: 5828 [4]: rx_csum_offload_errors: 1 [4]: tx_bytes: 35663361789 [4]: tx_ucast_packets: 194024824 [5]: rx_bytes: 16465075461 [5]: rx_ucast_packets: 110637200 [5]: tx_bytes: 43720432434 [5]: tx_ucast_packets: 202041894 [6]: rx_bytes: 16788706505 [6]: rx_ucast_packets: 113123182 [6]: tx_bytes: 38443961940 [6]: tx_ucast_packets: 202415075 [7]: rx_bytes: 16287423304 [7]: rx_ucast_packets: 110369475 [7]: rx_csum_offload_errors: 1 [7]: tx_bytes: 35104168638 [7]: tx_ucast_packets: 184905201 [8]: rx_bytes: 12689721791 [8]: rx_ucast_packets: 87616037 [8]: rx_discards: 2638 [8]: tx_bytes: 36133395431 [8]: tx_ucast_packets: 196547264 [9]: rx_bytes: 15007548011 [9]: rx_ucast_packets: 98183525 [9]: rx_csum_offload_errors: 1 [9]: tx_bytes: 34871314517 [9]: tx_ucast_packets: 188532637 [9]: tx_mcast_packets: 12 [10]: rx_bytes: 12112044826 [10]: rx_ucast_packets: 84335465 [10]: rx_discards: 2494 [10]: tx_bytes: 36562151913 [10]: tx_ucast_packets: 195658548 [11]: rx_bytes: 12873153712 [11]: rx_ucast_packets: 89305791 [11]: rx_discards: 2990 [11]: tx_bytes: 36348541675 [11]: tx_ucast_packets: 194155226 [12]: rx_bytes: 12768100958 [12]: rx_ucast_packets: 89350917 [12]: rx_discards: 2667 [12]: tx_bytes: 35730240389 [12]: tx_ucast_packets: 192254480 [13]: rx_bytes: 14533227468 [13]: rx_ucast_packets: 98139795 [13]: tx_bytes: 35954232494 [13]: tx_ucast_packets: 194573612 [13]: tx_bcast_packets: 2 [14]: rx_bytes: 13258647069 [14]: rx_ucast_packets: 92856762 [14]: rx_discards: 3509 [14]: rx_csum_offload_errors: 1 [14]: tx_bytes: 35663586641 [14]: tx_ucast_packets: 189661305 rx_bytes: 226125043936 rx_ucast_packets: 1536428109 rx_bcast_packets: 351 rx_discards: 20126 rx_filtered_packets: 8694 rx_csum_offload_errors: 11 tx_bytes: 548442367057 tx_ucast_packets: 2915571846 tx_mcast_packets: 12 tx_bcast_packets: 2 tx_64_byte_packets: 35417154 tx_65_to_127_byte_packets: 2006984660 tx_128_to_255_byte_packets: 373733514 tx_256_to_511_byte_packets: 378121090 tx_512_to_1023_byte_packets: 77643490 tx_1024_to_1522_byte_packets: 43669214 tx_pause_frames: 228 Some info about SACK: When to turn TCP SACK off?

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Deterministic/Consistent Unique Masking

    - by Dinesh Rajasekharan-Oracle
    One of the key requirements while masking data in large databases or multi database environment is to consistently mask some columns, i.e. for a given input the output should always be the same. At the same time the masked output should not be predictable. Deterministic masking also eliminates the need to spend enormous amount of time spent in identifying data relationships, i.e. parent and child relationships among columns defined in the application tables. In this blog post I will explain different ways of consistently masking the data across databases using Oracle Data Masking and Subsetting The readers of post should have minimal knowledge on Oracle Enterprise Manager 12c, Application Data Modeling, Data Masking concepts. For more information on these concepts, please refer to Oracle Data Masking and Subsetting document Oracle Data Masking and Subsetting 12c provides four methods using which users can consistently yet irreversibly mask their inputs. 1. Substitute 2. SQL Expression 3. Encrypt 4. User Defined Function SUBSTITUTE The substitute masking format replaces the original value with a value from a pre-created database table. As the method uses a hash based algorithm in the back end the mappings are consistent. For example consider DEPARTMENT_ID in EMPLOYEES table is replaced with FAKE_DEPARTMENT_ID from FAKE_TABLE. The substitute masking transformation that all occurrences of DEPARTMENT_ID say ‘101’ will be replaced with ‘502’ provided same substitution table and column is used , i.e. FAKE_TABLE.FAKE_DEPARTMENT_ID. The following screen shot shows the usage of the Substitute masking format with in a masking definition: Note that the uniqueness of the masked value depends on the number of columns being used in the substitution table i.e. if the original table contains 50000 unique values, then for the masked output to be unique and deterministic the substitution column should also contain 50000 unique values without which only consistency is maintained but not uniqueness. SQL EXPRESSION SQL Expression replaces an existing value with the output of a specified SQL Expression. For example while masking an EMPLOYEES table the EMAIL_ID of an employee has to be in the format EMPLOYEE’s [email protected] while FIRST_NAME and LAST_NAME are the actual column names of the EMPLOYEES table then the corresponding SQL Expression will look like %FIRST_NAME%||’.’||%LAST_NAME%||’@COMPANY.COM’. The advantage of this technique is that if you are masking FIRST_NAME and LAST_NAME of the EMPLOYEES table than the corresponding EMAIL ID will be replaced accordingly by the masking scripts. One of the interesting aspect’s of a SQL Expressions is that you can use sub SQL expressions, which means that you can write a nested SQL and use it as SQL Expression to address a complex masking business use cases. SQL Expression can also be used to consistently replace value with hashed value using Oracle’s PL/SQL function ORA_HASH. The following SQL Expression will help in the previous example for replacing the DEPARTMENT_IDs with a hashed number ORA_HASH (%DEPARTMENT_ID%, 1000) The following screen shot shows the usage of encrypt masking format with in the masking definition: ORA_HASH takes three arguments: 1. Expression which can be of any data type except LONG, LOB, User Defined Type [nested table type is allowed]. In the above example I used the Original value as expression. 2. Number of hash buckets which can be number between 0 and 4294967295. The default value is 4294967295. You can also co-relate the number of hash buckets to a range of numbers. In the above example above the bucket value is specified as 1000, so the end result will be a hashed number in between 0 and 1000. 3. Seed, can be any number which decides the consistency, i.e. for a given seed value the output will always be same. The default seed is 0. In the above SQL Expression a seed in not specified, so it to 0. If you have to use a non default seed then the function will look like. ORA_HASH (%DEPARTMENT_ID%, 1000, 1234 The uniqueness depends on the input and the number of hash buckets used. However as ORA_HASH uses a 32 bit algorithm, considering birthday paradox or pigeonhole principle there is a 0.5 probability of collision after 232-1 unique values. ENCRYPT Encrypt masking format uses a blend of 3DES encryption algorithm, hashing, and regular expression to produce a deterministic and unique masked output. The format of the masked output corresponds to the specified regular expression. As this technique uses a key [string] to encrypt the data, the same string can be used to decrypt the data. The key also acts as seed to maintain consistent outputs for a given input. The following screen shot shows the usage of encrypt masking format with in the masking definition: Regular Expressions may look complex for the first time users but you will soon realize that it’s a simple language. There are many resources in internet, oracle documentation, oracle learning library, my oracle support on writing a Regular Expressions, out of all the following My Oracle Support document helped me to get started with Regular Expressions: Oracle SQL Support for Regular Expressions[Video](Doc ID 1369668.1) USER DEFINED FUNCTION [UDF] User Defined Function or UDF provides flexibility for the users to code their own masking logic in PL/SQL, which can be called from masking Defintion. The standard format of an UDF in Oracle Data Masking and Subsetting is: Function udf_func (rowid varchar2, column_name varchar2, original_value varchar2) returns varchar2; Where • rowid is the row identifier of the column that needs to be masked • column_name is the name of the column that needs to be masked • original_value is the column value that needs to be masked You can achieve deterministic masking by using Oracle’s built in hash functions like, ORA_HASH, DBMS_CRYPTO.MD4, DBMS_CRYPTO.MD5, DBMS_UTILITY. GET_HASH_VALUE.Please refers to the Oracle Database Documentation for more information on the Oracle Hash functions. For example the following masking UDF generate deterministic unique hexadecimal values for a given string input: CREATE OR REPLACE FUNCTION RD_DUX (rid varchar2, column_name varchar2, orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2 (26); no_of_characters number(2); BEGIN no_of_characters:=6; stext:=substr(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(text),1)),0,no_of_characters); RETURN stext; END; The uniqueness depends on the input and length of the string and number of bits used by hash algorithm. In the above function MD4 hash is used [denoted by argument 1 in the DBMS_CRYPTO.HASH function which is a 128 bit algorithm which produces 2^128-1 unique hashed values , however this is limited by the length of the input string which is 6, so only 6^6 unique values will be generated. Also do not forget about the birthday paradox/pigeonhole principle mentioned earlier in this post. An another example is to consistently replace characters or numbers preserving the length and special characters as shown below: CREATE OR REPLACE FUNCTION RD_DUS(rid varchar2,column_name varchar2,orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2(26); BEGIN DBMS_RANDOM.SEED(orig_val); stext:=TRANSLATE(orig_val,'ABCDEFGHILKLMNOPQRSTUVWXYZ',DBMS_RANDOM.STRING('U',26)); stext:=TRANSLATE(stext,'abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.STRING('L',26)); stext:=TRANSLATE(stext,'0123456789',to_char(DBMS_RANDOM.VALUE(1,9))); stext:=REPLACE(stext,'.','0'); RETURN stext; END; The following screen shot shows the usage of an UDF with in a masking definition: To summarize, Oracle Data Masking and Subsetting helps you to consistently mask data across databases using one or all of the methods described in this post. It saves the hassle of identifying the parent-child relationships defined in the application table. Happy Masking

    Read the article

  • Load-balancing between a Procurve switch and a server

    - by vlad
    Hello I've been searching around the web for this problem i've been having. It's similar in a way to this question: How exactly & specifically does layer 3 LACP destination address hashing work? My setup is as follows: I have a central switch, a Procurve 2510G-24, image version Y.11.16. It's the center of a star topology, there are four switches connected to it via a single gigabit link. Those switches service the users. On the central switch, I have a server with two gigabit interfaces that I want to bond together in order to achieve higher throughput, and two other servers that have single gigabit connections to the switch. The topology looks as follows: sw1 sw2 sw3 sw4 | | | | --------------------- | sw0 | --------------------- || | | srv1 srv2 srv3 The servers were running FreeBSD 8.1. On srv1 I set up a lagg interface using the lacp protocol, and on the switch I set up a trunk for the two ports using lacp as well. The switch showed that the server was a lacp partner, I could ping the server from another computer, and the server could ping other computers. If I unplugged one of the cables, the connection would keep working, so everything looked fine. Until I tested throughput. There was only one link used between srv1 and sw0. All testing was conducted with iperf, and load distribution was checked with systat -ifstat. I was looking to test the load balancing for both receive and send operations, as I want this server to be a file server. There were therefore two scenarios: iperf -s on srv1 and iperf -c on the other servers iperf -s on the other servers and iperf -c on srv1 connected to all the other servers. Every time only one link was used. If one cable was unplugged, the connections would keep going. However, once the cable was plugged back in, the load was not distributed. Each and every server is able to fill the gigabit link. In one-to-one test scenarios, iperf was reporting around 940Mbps. The CPU usage was around 20%, which means that the servers could withstand a doubling of the throughput. srv1 is a dell poweredge sc1425 with onboard intel 82541GI nics (em driver on freebsd). After troubleshooting a previous problem with vlan tagging on top of a lagg interface, it turned out that the em could not support this. So I figured that maybe something else is wrong with the em drivers and / or lagg stack, so I started up backtrack 4r2 on this same server. So srv1 now uses linux kernel 2.6.35.8. I set up a bonding interface bond0. The kernel module was loaded with option mode=4 in order to get lacp. The switch was happy with the link, I could ping to and from the server. I could even put vlans on top of the bonding interface. However, only half the problem was solved: if I used srv1 as a client to the other servers, iperf was reporting around 940Mbps for each connection, and bwm-ng showed, of course, a nice distribution of the load between the two nics; if I run the iperf server on srv1 and tried to connect with the other servers, there was no load balancing. I thought that maybe I was out of luck and the hashes for the two mac addresses of the clients were the same, so I brought in two new servers and tested with the four of them at the same time, and still nothing changed. I tried disabling and reenabling one of the links, and all that happened was the traffic switched from one link to the other and back to the first again. I also tried setting the trunk to "plain trunk mode" on the switch, and experimented with other bonding modes (roundrobin, xor, alb, tlb) but I never saw any traffic distribution. One interesting thing, though: one of the four switches is a Cisco 2950, image version 12.1(22)EA7. It has 48 10/100 ports and 2 gigabit uplinks. I have a server (call it srv4) with a 4 channel trunk connected to it (4x100), FreeBSD 8.0 release. The switch is connected to sw0 via gigabit. If I set up an iperf server on one of the servers connected to sw0 and a client on srv4, ALL 4 links are used, and iperf reports around 330Mbps. systat -ifstat shows all four interfaces are used. The cisco port-channel uses src-mac to balance the load. The HP should use both the source and destination according to the manual, so it should work as well. Could this mean there is some bug in the HP firmware? Am I doing something wrong?

    Read the article

  • Random Complete System Unresponsiveness Running Mathematical Functions

    - by Computer Guru
    I have a program that loads a file (anywhere from 10MB to 5GB) a chunk at a time (ReadFile), and for each chunk performs a set of mathematical operations (basically calculates the hash). After calculating the hash, it stores info about the chunk in an STL map (basically <chunkID, hash>) and then writes the chunk itself to another file (WriteFile). That's all it does. This program will cause certain PCs to choke and die. The mouse begins to stutter, the task manager takes 2 min to show, ctrl+alt+del is unresponsive, running programs are slow.... the works. I've done literally everything I can think of to optimize the program, and have triple-checked all objects. What I've done: Tried different (less intensive) hashing algorithms. Switched all allocations to nedmalloc instead of the default new operator Switched from stl::map to unordered_set, found the performance to still be abysmal, so I switched again to Google's dense_hash_map. Converted all objects to store pointers to objects instead of the objects themselves. Caching all Read and Write operations. Instead of reading a 16k chunk of the file and performing the math on it, I read 4MB into a buffer and read 16k chunks from there instead. Same for all write operations - they are coalesced into 4MB blocks before being written to disk. Run extensive profiling with Visual Studio 2010, AMD Code Analyst, and perfmon. Set the thread priority to THREAD_MODE_BACKGROUND_BEGIN Set the thread priority to THREAD_PRIORITY_IDLE Added a Sleep(100) call after every loop. Even after all this, the application still results in a system-wide hang on certain machines under certain circumstances. Perfmon and Process Explorer show minimal CPU usage (with the sleep), no constant reads/writes from disk, few hard pagefaults (and only ~30k pagefaults in the lifetime of the application on a 5GB input file), little virtual memory (never more than 150MB), no leaked handles, no memory leaks. The machines I've tested it on run Windows XP - Windows 7, x86 and x64 versions included. None have less than 2GB RAM, though the problem is always exacerbated under lower memory conditions. I'm at a loss as to what to do next. I don't know what's causing it - I'm torn between CPU or Memory as the culprit. CPU because without the sleep and under different thread priorities the system performances changes noticeably. Memory because there's a huge difference in how often the issue occurs when using unordered_set vs Google's dense_hash_map. What's really weird? Obviously, the NT kernel design is supposed to prevent this sort of behavior from ever occurring (a user-mode application driving the system to this sort of extreme poor performance!?)..... but when I compile the code and run it on OS X or Linux (it's fairly standard C++ throughout) it performs excellently even on poor machines with little RAM and weaker CPUs. What am I supposed to do next? How do I know what the hell it is that Windows is doing behind the scenes that's killing system performance, when all the indicators are that the application itself isn't doing anything extreme? Any advice would be most welcome.

    Read the article

  • algorithm q: Fuzzy matching of structured data

    - by user86432
    I have a fairly small corpus of structured records sitting in a database. Given a tiny fraction of the information contained in a single record, submitted via a web form (so structured in the same way as the table schema), (let us call it the test record) I need to quickly draw up a list of the records that are the most likely matches for the test record, as well as provide a confidence estimate of how closely the search terms match a record. The primary purpose of this search is to discover whether someone is attempting to input a record that is duplicate to one in the corpus. There is a reasonable chance that the test record will be a dupe, and a reasonable chance the test record will not be a dupe. The records are about 12000 bytes wide and the total count of records is about 150,000. There are 110 columns in the table schema and 95% of searches will be on the top 5% most commonly searched columns. The data is stuff like names, addresses, telephone numbers, and other industry specific numbers. In both the corpus and the test record it is entered by hand and is semistructured within an individual field. You might at first blush say "weight the columns by hand and match word tokens within them", but it's not so easy. I thought so too: if I get a telephone number I thought that would indicate a perfect match. The problem is that there isn't a single field in the form whose token frequency does not vary by orders of magnitude. A telephone number might appear 100 times in the corpus or 1 time in the corpus. The same goes for any other field. This makes weighting at the field level impractical. I need a more fine-grained approach to get decent matching. My initial plan was to create a hash of hashes, top level being the fieldname. Then I would select all of the information from the corpus for a given field, attempt to clean up the data contained in it, and tokenize the sanitized data, hashing the tokens at the second level, with the tokens as keys and frequency as value. I would use the frequency count as a weight: the higher the frequency of a token in the reference corpus, the less weight I attach to that token if it is found in the test record. My first question is for the statisticians in the room: how would I use the frequency as a weight? Is there a precise mathematical relationship between n, the number of records, f(t), the frequency with which a token t appeared in the corpus, the probability o that a record is an original and not a duplicate, and the probability p that the test record is really a record x given the test and x contain the same t in the same field? How about the relationship for multiple token matches across multiple fields? Since I sincerely doubt that there is, is there anything that gets me close but is better than a completely arbitrary hack full of magic factors? Barring that, has anyone got a way to do this? I'm especially keen on other suggestions that do not involve maintaining another table in the database, such as a token frequency lookup table :). This is my first post on StackOverflow, thanks in advance for any replies you may see fit to give.

    Read the article

  • SQL SERVER – Guest Post – Jonathan Kehayias – Wait Type – Day 16 of 28

    - by pinaldave
    Jonathan Kehayias (Blog | Twitter) is a MCITP Database Administrator and Developer, who got started in SQL Server in 2004 as a database developer and report writer in the natural gas industry. After spending two and a half years working in TSQL, in late 2006, he transitioned to the role of SQL Database Administrator. His primary passion is performance tuning, where he frequently rewrites queries for better performance and performs in depth analysis of index implementation and usage. Jonathan blogs regularly on SQLBlog, and was a coauthor of Professional SQL Server 2008 Internals and Troubleshooting. On a personal note, I think Jonathan is extremely positive person. In every conversation with him I have found that he is always eager to help and encourage. Every time he finds something needs to be approved, he has contacted me without hesitation and guided me to improve, change and learn. During all the time, he has not lost his focus to help larger community. I am honored that he has accepted to provide his views on complex subject of Wait Types and Queues. Currently I am reading his series on Extended Events. Here is the guest blog post by Jonathan: SQL Server troubleshooting is all about correlating related pieces of information together to indentify where exactly the root cause of a problem lies. In my daily work as a DBA, I generally get phone calls like, “So and so application is slow, what’s wrong with the SQL Server.” One of the funny things about the letters DBA is that they go so well with Default Blame Acceptor, and I really wish that I knew exactly who the first person was that pointed that out to me, because it really fits at times. A lot of times when I get this call, the problem isn’t related to SQL Server at all, but every now and then in my initial quick checks, something pops up that makes me start looking at things further. The SQL Server is slow, we see a number of tasks waiting on ASYNC_IO_COMPLETION, IO_COMPLETION, or PAGEIOLATCH_* waits in sys.dm_exec_requests and sys.dm_exec_waiting_tasks. These are also some of the highest wait types in sys.dm_os_wait_stats for the server, so it would appear that we have a disk I/O bottleneck on the machine. A quick check of sys.dm_io_virtual_file_stats() and tempdb shows a high write stall rate, while our user databases show high read stall rates on the data files. A quick check of some performance counters and Page Life Expectancy on the server is bouncing up and down in the 50-150 range, the Free Page counter consistently hits zero, and the Free List Stalls/sec counter keeps jumping over 10, but Buffer Cache Hit Ratio is 98-99%. Where exactly is the problem? In this case, which happens to be based on a real scenario I faced a few years back, the problem may not be a disk bottleneck at all; it may very well be a memory pressure issue on the server. A quick check of the system spec’s and it is a dual duo core server with 8GB RAM running SQL Server 2005 SP1 x64 on Windows Server 2003 R2 x64. Max Server memory is configured at 6GB and we think that this should be enough to handle the workload; or is it? This is a unique scenario because there are a couple of things happening inside of this system, and they all relate to what the root cause of the performance problem is on the system. If we were to query sys.dm_exec_query_stats for the TOP 10 queries, by max_physical_reads, max_logical_reads, and max_worker_time, we may be able to find some queries that were using excessive I/O and possibly CPU against the system in their worst single execution. We can also CROSS APPLY to sys.dm_exec_sql_text() and see the statement text, and also CROSS APPLY sys.dm_exec_query_plan() to get the execution plan stored in cache. Ok, quick check, the plans are pretty big, I see some large index seeks, that estimate 2.8GB of data movement between operators, but everything looks like it is optimized the best it can be. Nothing really stands out in the code, and the indexing looks correct, and I should have enough memory to handle this in cache, so it must be a disk I/O problem right? Not exactly! If we were to look at how much memory the plan cache is taking by querying sys.dm_os_memory_clerks for the CACHESTORE_SQLCP and CACHESTORE_OBJCP clerks we might be surprised at what we find. In SQL Server 2005 RTM and SP1, the plan cache was allowed to take up to 75% of the memory under 8GB. I’ll give you a second to go back and read that again. Yes, you read it correctly, it says 75% of the memory under 8GB, but you don’t have to take my word for it, you can validate this by reading Changes in Caching Behavior between SQL Server 2000, SQL Server 2005 RTM and SQL Server 2005 SP2. In this scenario the application uses an entirely adhoc workload against SQL Server and this leads to plan cache bloat, and up to 4.5GB of our 6GB of memory for SQL can be consumed by the plan cache in SQL Server 2005 SP1. This in turn reduces the size of the buffer cache to just 1.5GB, causing our 2.8GB of data movement in this expensive plan to cause complete flushing of the buffer cache, not just once initially, but then another time during the queries execution, resulting in excessive physical I/O from disk. Keep in mind that this is not the only query executing at the time this occurs. Remember the output of sys.dm_io_virtual_file_stats() showed high read stalls on the data files for our user databases versus higher write stalls for tempdb? The memory pressure is also forcing heavier use of tempdb to handle sorting and hashing in the environment as well. The real clue here is the Memory counters for the instance; Page Life Expectancy, Free List Pages, and Free List Stalls/sec. The fact that Page Life Expectancy is fluctuating between 50 and 150 constantly is a sign that the buffer cache is experiencing constant churn of data, once every minute to two and a half minutes. If you add to the Page Life Expectancy counter, the consistent bottoming out of Free List Pages along with Free List Stalls/sec consistently spiking over 10, and you have the perfect memory pressure scenario. All of sudden it may not be that our disk subsystem is the problem, but is instead an innocent bystander and victim. Side Note: The Page Life Expectancy counter dropping briefly and then returning to normal operating values intermittently is not necessarily a sign that the server is under memory pressure. The Books Online and a number of other references will tell you that this counter should remain on average above 300 which is the time in seconds a page will remain in cache before being flushed or aged out. This number, which equates to just five minutes, is incredibly low for modern systems and most published documents pre-date the predominance of 64 bit computing and easy availability to larger amounts of memory in SQL Servers. As food for thought, consider that my personal laptop has more memory in it than most SQL Servers did at the time those numbers were posted. I would argue that today, a system churning the buffer cache every five minutes is in need of some serious tuning or a hardware upgrade. Back to our problem and its investigation: There are two things really wrong with this server; first the plan cache is excessively consuming memory and bloated in size and we need to look at that and second we need to evaluate upgrading the memory to accommodate the workload being performed. In the case of the server I was working on there were a lot of single use plans found in sys.dm_exec_cached_plans (where usecounts=1). Single use plans waste space in the plan cache, especially when they are adhoc plans for statements that had concatenated filter criteria that is not likely to reoccur with any frequency.  SQL Server 2005 doesn’t natively have a way to evict a single plan from cache like SQL Server 2008 does, but MVP Kalen Delaney, showed a hack to evict a single plan by creating a plan guide for the statement and then dropping that plan guide in her blog post Geek City: Clearing a Single Plan from Cache. We could put that hack in place in a job to automate cleaning out all the single use plans periodically, minimizing the size of the plan cache, but a better solution would be to fix the application so that it uses proper parameterized calls to the database. You didn’t write the app, and you can’t change its design? Ok, well you could try to force parameterization to occur by creating and keeping plan guides in place, or we can try forcing parameterization at the database level by using ALTER DATABASE <dbname> SET PARAMETERIZATION FORCED and that might help. If neither of these help, we could periodically dump the plan cache for that database, as discussed as being a problem in Kalen’s blog post referenced above; not an ideal scenario. The other option is to increase the memory on the server to 16GB or 32GB, if the hardware allows it, which will increase the size of the plan cache as well as the buffer cache. In SQL Server 2005 SP1, on a system with 16GB of memory, if we set max server memory to 14GB the plan cache could use at most 9GB  [(8GB*.75)+(6GB*.5)=(6+3)=9GB], leaving 5GB for the buffer cache.  If we went to 32GB of memory and set max server memory to 28GB, the plan cache could use at most 16GB [(8*.75)+(20*.5)=(6+10)=16GB], leaving 12GB for the buffer cache. Thankfully we have SQL Server 2005 Service Pack 2, 3, and 4 these days which include the changes in plan cache sizing discussed in the Changes to Caching Behavior between SQL Server 2000, SQL Server 2005 RTM and SQL Server 2005 SP2 blog post. In real life, when I was troubleshooting this problem, I spent a week trying to chase down the cause of the disk I/O bottleneck with our Server Admin and SAN Admin, and there wasn’t much that could be done immediately there, so I finally asked if we could increase the memory on the server to 16GB, which did fix the problem. It wasn’t until I had this same problem occur on another system that I actually figured out how to really troubleshoot this down to the root cause.  I couldn’t believe the size of the plan cache on the server with 16GB of memory when I actually learned about this and went back to look at it. SQL Server is constantly telling a story to anyone that will listen. As the DBA, you have to sit back and listen to all that it’s telling you and then evaluate the big picture and how all the data you can gather from SQL about performance relate to each other. One of the greatest tools out there is actually a free in the form of Diagnostic Scripts for SQL Server 2005 and 2008, created by MVP Glenn Alan Berry. Glenn’s scripts collect a majority of the information that SQL has to offer for rapid troubleshooting of problems, and he includes a lot of notes about what the outputs of each individual query might be telling you. When I read Pinal’s blog post SQL SERVER – ASYNC_IO_COMPLETION – Wait Type – Day 11 of 28, I noticed that he referenced Checking Memory Related Performance Counters in his post, but there was no real explanation about why checking memory counters is so important when looking at an I/O related wait type. I thought I’d chat with him briefly on Google Talk/Twitter DM and point this out, and offer a couple of other points I noted, so that he could add the information to his blog post if he found it useful.  Instead he asked that I write a guest blog for this. I am honored to be a guest blogger, and to be able to share this kind of information with the community. The information contained in this blog post is a glimpse at how I do troubleshooting almost every day of the week in my own environment. SQL Server provides us with a lot of information about how it is running, and where it may be having problems, it is up to us to play detective and find out how all that information comes together to tell us what’s really the problem. This blog post is written by Jonathan Kehayias (Blog | Twitter). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: MVP, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • CodePlex Daily Summary for Friday, October 04, 2013

    CodePlex Daily Summary for Friday, October 04, 2013Popular ReleasesMoreTerra (Terraria World Viewer): Version 1.11: Release Notes Release 1.11 =========== =Bug Fixes= =========== Now works with Terraria 1.2 wld files. =============== =Known Issues= =============== Not all tiles and items are accounted for. Missing tiles just show up as pink. This is actively being worked on but wanted to get a build out that works with 1.2VG-Ripper & PG-Ripper: PG-Ripper 1.4.19: NEW: Added Option to login as Guest NEW: Added Support for "ImageTeam.org linksStyleMVVM: 3.1.4: This release has virtually no code change but adds multiple new Item templates for the Windows Phone 8 platformSystem Center Orchestrator Community Project: Orchestrator Visio and Word Generator 1.5: This tool lets you export Orchestrator runbooks as a Visio diagram, and you can also generate an optional Word file as well. Components exported as of v1.5 are : - Title of the runbook - Activities and their names/thumbnails/description (description is displayed as a callout in the Visio diagram, attached to the shape of the activity) - Links and their names/colors - Looping and their interval Thumbnails and activities are grouped in the Visio diagram, for easy manipulation of the diagra...State of Decay Save Manager: Version 1.0.4: Add version at bottom of formDNN® Form and List: DNN Form and List 06.00.07: DotNetNuke Form and List 06.00.06 Changes to 6.0.7•Fixed an error in datatypes.config that caused calculated fields to be missing in 6.0.6 Changes to 6.0.6•Add in Sql to remove 'text on row' setting for UserDefinedTable to make SQL Azure compatible. •Add new azureCompatible element to manifest. •Added a fix for importing templates. Changes to 6.0.2•Fix: MakeThumbnail was broken if the application pool was configured to .Net 4 •Change: Data is now stored in nvarchar(max) instead of ntext C...SimpleExcelReportMaker: Serm 0.03: SourceCode and Sample .Net Framework 3.5 AnyCPU compile.RDFSharp - Start playing with RDF!: RDFSharp-0.6.6: GENERAL (NEW) Introduction of INT64 hashing engine (codenamed "Greta"); QUERY (FIX) Incorrect query evaluation due to faulty detection of optional patterns (v0.6.5 regression); (FIX) Missing update of PatternGroupID information after adding patterns and filters to a pattern group; (FIX) Ensure Context information of a pattern is not null before trying to collect it as variable; (MISC) Changed semantics of Context information of a pattern: if not provided, it will be ignored; (MISC...Application Architecture Guidelines: App Architecture Guidelines 3.0.8: This document is an overview of software qualities, principles, patterns, practices, tools and libraries.C# Intellisense for Notepad++: Release v1.0.7.2: - smart indentation - document formatting To avoid the DLLs getting locked by OS use MSI file for the installation.BlackJumboDog: Ver5.9.6: 2013.09.30 Ver5.9.6 (1)SMTP???????、???????????????? (2)WinAPI??????? (3)Web???????CGI???????????????????????Microsoft Ajax Minifier: Microsoft Ajax Minifier 5.2: Mostly internal code tweaks. added -nosize switch to turn off the size- and gzip-calculations done after minification. removed the comments in the build targets script for the old AjaxMin build task (discussion #458831). Fixed an issue with extended Unicode characters encoded inside a string literal with adjacent \uHHHH\uHHHH sequences. Fixed an IndexOutOfRange exception when encountering a CSS identifier that's a single underscore character (_). In previous builds, the net35 and net20...AJAX Control Toolkit: September 2013 Release: AJAX Control Toolkit Release Notes - September 2013 Release (Updated) Version 7.1002September 2013 release of the AJAX Control Toolkit. AJAX Control Toolkit .NET 4.5 – AJAX Control Toolkit for .NET 4.5 and sample site (Recommended). AJAX Control Toolkit .NET 4 – AJAX Control Toolkit for .NET 4 and sample site (Recommended). AJAX Control Toolkit .NET 3.5 – AJAX Control Toolkit for .NET 3.5 and sample site (Recommended). Important UpdateThis release has been updated to fix two issues: Upda...WDTVHubGen - Adds Metadata, thumbnails and subtitles to WDTV Live Hubs: WDTVHubGen.v2.1.4.apifix-alpha: WDTVHubGen.v2.1.4.apifix-alpha is for testers to figure out if we got the NEW api plugged in ok. thanksVisual Log Parser: VisualLogParser: Portable Visual Log Parser for Dotnet 4.0AudioWordsDownloader: AudioWordsDownloader 1.1 build 88: New features list of words (mp3 files) is available upon typing when a download path is defined list of download paths is added paths history settings added Bug fixed case mismatch in word search field fixed path not exist bug fixed when history has been used path, when filled from dialog, not stored refresh autocomplete list after path change word sought is deleted when path is changed at the end sought word list is deleted word list not refreshed download ends. word lis...Wsus Package Publisher: Release v1.3.1309.28: Fix a bug, where WPP crash when running on a computer where Windows was installed in another language than Fr, En or De, and launching the Update Creation Wizard. Fix a bug, where WPP crash if some Multi-Thread job are launch with more than 64 items. Add a button to abort "Install This Update" wizard. Allow WPP to remember which columns are shown last time. Make URL clickable on the Update Information Tab. Add a new feature, when Double-Clicking on an update, the default action exec...Tweetinvi a friendly Twitter C# API: Alpha 0.8.3.0: Version 0.8.3.0 emphasis on the FIlteredStream and ease how to manage Exceptions that can occur due to the network or any other issue you might encounter. Will be available through nuget the 29/09/2013. FilteredStream Features provided by the Twitter Stream API - Ability to track specific keywords - Ability to track specific users - Ability to track specific locations Additional features - Detect the reasons the tweet has been retrieved from the Filtered API. You have access to both the ma...WPF Extended DataGrid: WPF Extended DataGrid 2.0.0.4 binaries: Improved performance of GroupByAcDown?????: AcDown????? v4.5: ??●AcDown??????????、??、??、???????。????,????,?????????????????????????。???????????Acfun、????(Bilibili)、??、??、YouTube、??、???、??????、SF????、????????????。 ●??????AcPlay?????,??????、????????????????。 ● AcDown???????C#??,????.NET Framework 2.0??。?????"Acfun?????"。 ??v4.5 ???? AcPlay????????v3.5 ????????,???????????30% ?? ???????GoodManga.net???? ?? ?????????? ?? ??Acfun?????????? ??Bilibili??????????? ?????????flvcd???????? ??SfAcg????????????? ???????????? ???????????????? ????32...New ProjectsAnalysis Services Activity Viewer 2012: This project was created from a blog post I wrote back in February 2013 http://redphoenix.me/2013/08/22/upgrade-activity-viewer-2008-to-sql-server-2012/ARYSTA SYSTEM: Notebook System A/R Sales * Sales Order * Delivery * Sales Invoice * Sales Return * A/R Credit Memo * A/R Debit Memo A/P Purchasing * Purchase Order BEWELL SYSTEM: This system is design for Bewell-C Incorporated. Project Manager: Ben Penafiel System Engineer: Jhay Camba Report Designer: William Damasco Caching IOC: Caching IOC Container This will cache any Interface return results making it really easy to introduce caching to your solution.Grupo Anclita: Trabajo Final Laboratorio 4Halcyonic Skin by HTML5-UP - for DNN: This skin was converted for use in DNN by Michael Doxsey. Original HTML template designed and built by HTML5-UP: http://html5up.net/halcyonic/ ProPro: project about other projectsS3Unlock: Unlocks S3 agents that are stuck installing.sdfsdlfsdlkj01: dsfdffsdSerendipity - Responsive Skin for DNN: This is an HTML Template by Elemis, converted for use in DNN. Elemis URL: http://elemisfreebies.com/premium-themes/ Free for personal use and ed. purposes only.SmartSystemMenu: Smart system menu for you.SP 2013 Custom MultiTenant Adminstration: This Project creates a custom SP 2013 Tenant Admin site template covering limitations of existing tenant admin site.Telephasic Skin by HTML5-UP - for DNN: This skin was converted for use in DNN by Michael Doxsey. Original HTML template designed and built by HTML5-UP: http://html5up.net/telephasic/TelerikedIn: Social web app trying to look and feel like the famous LinkedInTerra 2: Generador de personajes para el juego de rol Terra2tsydev01: TextLineUpdateVds2465 Parser: This Project is about implementing a Parser for the Vds2465 protocol. It includes parsing and generating Vds2465 telegram bytes.Your Appliances: Empresa De ElectrodomesticosZeroFour by HTML5-UP - for DNN: This skin was converted for use in DNN by Michael Doxsey. Original HTML template designed and built by HTML5-UP: http://html5up.net/zerofour.

    Read the article

  • CodePlex Daily Summary for Wednesday, May 30, 2012

    CodePlex Daily Summary for Wednesday, May 30, 2012Popular ReleasesOMS.Ice - T4 Text Template Generator: OMS.Ice - T4 Text Template Generator v1.4.0.14110: Issue 601 - Template file name cannot contain characters that are not allowed in C#/VB identifiers Issue 625 - Last line will be ignored by the parser Issue 626 - Usage of environment variables and macrosSilverlight Toolkit: Silverlight 5 Toolkit Source - May 2012: Source code for December 2011 Silverlight 5 Toolkit release.totalem: version 2012.05.30.1: Beta version added function for mass renaming files and foldersAudio Pitch & Shift: Audio Pitch and Shift 4.4.0: Tracklist added on main window Improved performances with tracklist Some other fixesJson.NET: Json.NET 4.5 Release 6: New feature - Added IgnoreDataMemberAttribute support New feature - Added GetResolvedPropertyName to DefaultContractResolver New feature - Added CheckAdditionalContent to JsonSerializer Change - Metro build now always uses late bound reflection Change - JsonTextReader no longer returns no content after consecutive underlying content read failures Fix - Fixed bad JSON in an array with error handling creating an infinite loop Fix - Fixed deserializing objects with a non-default cons...DBScripterCmd - A command line tool to script database objects to seperate files: DBScripterCmd Source v1.0.2.zip: Add support for SQL Server 2005Indent Guides for Visual Studio: Indent Guides v12.1: Version History Changed in v12.1: Fixed crash when unable to start asynchronous analysis Fixed upgrade from v11 Changed in v12: background document analysis new options dialog with Quick Set selections for behavior new "glow" style for guides new menu icon in VS 11 preview control now uses editor theming highlighting can be customised on each line fixed issues with collapsed code blocks improved behaviour around left-aligned pragma/preprocessor commands (C#/C++) new setting...DotNetNuke® Community Edition CMS: 06.02.00: Major Highlights Fixed issue in the Site Settings when single quotes were being treated as escape characters Fixed issue loading the Mobile Premium Data after upgrading from CE to PE Fixed errors logged when updating folder provider settings Fixed the order of the mobile device capabilities in the Site Redirection Management UI The User Profile page was completely rebuilt. We needed User Profiles to have multiple child pages. This would allow for the most flexibility by still f...StarTrinity Face Recognition Library: Version 1.2: Much better accuracy????: ????2.0.1: 1、?????。WiX Toolset: WiX v3.6 RC: WiX v3.6 RC (3.6.2928.0) provides feature complete Burn with VS11 support. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/5/28/WiX-v3.6-Release-Candidate-availableJavascript .NET: Javascript .NET v0.7: SetParameter() reverts to its old behaviour of allowing JavaScript code to add new properties to wrapped C# objects. The behavior added briefly in 0.6 (throws an exception) can be had via the new SetParameterOptions.RejectUnknownProperties. TerminateExecution now uses its isolate to terminate the correct context automatically. Added support for converting all C# integral types, decimal and enums to JavaScript numbers. (Previously only the common types were handled properly.) Bug fixe...callisto: callisto 2.0.29: Added DNS functionality to scripting. See documentation section for details of how to incorporate this into your scripts.Phalanger - The PHP Language Compiler for the .NET Framework: 3.0 (May 2012): Fixes: unserialize() of negative float numbers fix pcre possesive quantifiers and character class containing ()[] array deserilization when the array contains a reference to ISerializable parsing lambda function fix round() reimplemented as it is in PHP to avoid .NET rounding errors filesize bypass for FileInfo.Length bug in Mono New features: Time zones reimplemented, uses Windows/Linux databaseSharePoint Euro 2012 - UEFA European Football Predictor: havivi.euro2012.wsp (1.1): New fetures:Admin enable / disable match Hide/Show Euro 2012 SharePoint lists (3 lists) Installing SharePoint Euro 2012 PredictorSharePoint Euro 2012 Predictor has been developed as a SharePoint Sandbox solution to support SharePoint Online (Office 365) Download the solution havivi.euro2012.wsp from the download page: Downloads Upload this solution to your Site Collection via the solutions area. Click on Activate to make the web parts in the solution available for use in the Site C...????SDK for .Net 4.0+(OAuth2.0+??V2?API): ??V2?SDK???: ????SDK for .Net 4.X???????PHP?SDK???OAuth??API???Client???。 ??????API?? ???????OAuth2.0???? ???:????????,DEMO??AppKey????????????????,?????AppKey,????AppKey???????????,?????“????>????>????>??????”.Net Code Samples: Code Samples: Code samples (SLNs).LINQ_Koans: LinqKoans v.02: Cleaned up a bitCommonLibrary.NET: CommonLibrary.NET 0.9.8 - Final Release: A collection of very reusable code and components in C# 4.0 ranging from ActiveRecord, Csv, Command Line Parsing, Configuration, Holiday Calendars, Logging, Authentication, and much more. FluentscriptCommonLibrary.NET 0.9.8 contains a scripting language called FluentScript. Application: FluentScript Version: 0.9.8 Build: 0.9.8.4 Changeset: 75050 ( CommonLibrary.NET ) Release date: May 24, 2012 Binaries: CommonLibrary.dll Namespace: ComLib.Lang Project site: http://fluentscript.codeplex.com...JayData - The cross-platform HTML5 data-management library for JavaScript: JayData 1.0 RC1 Refresh 2: JayData is a unified data access library for JavaScript developers to query and update data from different sources like webSQL, indexedDB, OData, Facebook or YQL. See it in action in this 6 minutes video: http://www.youtube.com/watch?v=LlJHgj1y0CU RC1 R2 Release highlights Knockout.js integrationUsing the Knockout.js module, your UI can be automatically refreshed when the data model changes, so you can develop the front-end of your data manager app even faster. Querying 1:N relations in W...New Projects5Widgets: 5Widgets is a framework for building HTML5 canvas interfaces. Written in JavaScript, 5Widgets consists of a library of widgets and a controller that implements the MVC pattern. Though the HTML5 standard is gaining popularity, there is no framework like this at the moment. Yet, as a professional developer, I know many, including myself, would really find such a library useful. I have uploaded my initial code, which can definitely be improved since I have not had the time to work on it fu...Azure Trace Listener: Simple Trace Listener outputting trace data directly to Windows Azure Queue or Table Storage. Unlike the Windows Azure Diagnostics Listener (WAD), logging happens immediately and does not rely on (scheduled or manually triggered) Log Transfer mechanism. A simple Reader application shows how to read trace entries and can be used as is or as base for more advanced scenarios.CodeSample2012: Code Sample is a windows tool for saving pieces of codeEncryption: The goal of the Encryption project is to provide solid, high quality functionality that aims at taking the complexity out of using the System.Security.Cryptography namespace. The first pass of this library provides a very strong password encryption system. It uses variable length random salt bytes with secure SHA512 cryptographic hashing functions to allow you to provide a high level of security to the users. Entity Framework Code-First Automatic Database Migration: The Entity Framework Code-First Automatic Database Migration tool was designed to help developers easily update their database schema while preserving their data when they change their POCO objects. This is not meant to take the place of Code-First Migrations. This project is simply designed to ease the development burden of database changes. It grew out of the desire to not have to delete, recreated, and seed the database every time I made an object model change. Function Point Plugin: Function Point Tracability Mapper VSIX for Visual Studio 2010/TFS 2010+FunkOS: dcpu16 operating systemGit for WebMatrix: This is a WebMatrix Extension that allows users to access Git Source Control functions.Groupon Houses: the groupon site for housesLiquifier - Complete serialisation/deserialisation for complex object graphs: Liquifier is a serialisation/deserialisation library for preserving object graphs with references intact. Liquifier uses attributes and interfaces to allow the user to control how a type is serialised - the aim is to free the user from having to write code to serialise and deserialise objects, especially large or complex graphs in which this is a significant undertaking.MTACompCommEx032012: lak lak lakMVC Essentials: MVC Essentials is aimed to have all my learning in the projects that I have worked.MyWireProject: This project manages wireless networks.Peulot Heshbon - Hebrew: This program is for teaching young students math, until 6th grade. The program gives questions for the user. The user needs to answer the question. After 10 questions the user gets his mark. The marks are saved and can be viewed from every run. PlusOne: A .NET Extension and Utility Library: PlusOne is a library of extension and utility methods for C#.Project Support: This project is a simple project management solution. This will allow you to manage your clients, track bug reports, request additional features to projects that you are currently working on and much more. A client will be allowed to have multiple users, so that you can track who has made reports etc and provide them feedback. The solution is set-up so that if you require you can modify the styling to fit your companies needs with ease, you can even have multiple styles that can be set ...SharePoint 2010 Slide Menu Control: Navigation control for building SharePoint slide menuSIGO: 1 person following this project (follow) Projeto SiGO O Projeto SiGO (Sistema de Gerenciamento Odontologico) tem um Escorpo Complexo com varios programas e rotinas, compondo o modulo de SAC e CRM, Finanças, Estoque e outro itens. Coordenador Heitor F Neto : O Projeto SiGo desenvolvido aqui no CodePlex e open source, sera apenas um Prototipo, assim que desenvolmemos os modulos basicos iremos migrar para um servidor Pago e com segurança dos Codigo Fonte e Banco de Dados. Pessoa...STEM123: Windows Phone 7 application to help people find and create STEM Topic details.TIL: Text Injection and Templating Library: An advanced alternative to string.Format() that encapsulates templates in objects, uses named tokens, and lets you define extra serializing/processing steps before injecting input into the template (think, join an array, serialize an object, etc).UAH Exchange: Ukrainian hrivna currency exchangeuberBook: uberBook ist eine Kontakt-Verwaltung für´s Tray. Das Programm syncronisiert sämtliche Kontakte mit dem Internet und sucht automatisch nach Social-Network Profilen Ihrer KontakteWPF Animated GIF: A simple library to display animated GIF images in WPF, usable in XAML or in code.

    Read the article

< Previous Page | 9 10 11 12 13 14  | Next Page >