Search Results

Search found 134 results on 6 pages for 'prediction'.

Page 5/6 | < Previous Page | 1 2 3 4 5 6 | Next Page >

ArchBeat Link-o-Rama for 2012-08-29

- by Bob Rhubart

ORCLville: OOW 2012 - Crystal BallOracle ACE Director Floyd Teter cooks up some tongue-in-cheek predictions for news and announcements that might come out of Oracle OpenWorld 2012. What's your prediction? Oracle Optimized Solutions at Oracle OpenWorld 2012 | Oracle Hardware Hardware matters, too! The people behind the Oracle Hardware blog have put together a list of Oracle Openworld 2012 sessions focused Oracle Optimized Solutions, "designed, pre-tested, tuned and fully documented architectures for optimal performance and availability." Just plug the session ID numbers into Schedule Builder and you're good to go. AIX Checklist for stable OBIEE deployment | Dick Dunbar "OBIEE is a complicated system with many moving parts and connection points," according to Oracle Business Inteligence escalation engineer Dick Dunbar. "The purpose of this article is to provide a checklist to discuss OBIEE deployment with your systems administrators." Demo for OPN: Coherence Management with EM Cloud Control 12c Oracle Partner Network members can check out a new Coherence Management demo that showcases some of the key capabilities of Management Pack for Oracle Coherence and JVM Diagnostics. "The demo flow showcases the key enhancements made in Enterprise Manager 12c release which includes new customizable performance summary, cache data management and configuration management," according to the WebLogic Partner Community EMEA blog. The Pragmatic Architect: To Boldly Go Where No One Has Gone Before | Frank Buschmann "Many architects have technical knowledge that's both impressive and sound, which is indeed an inevitable basis for design success," says Frank Buschmann. "Yet, a lot of software projects fail or suffer due to severe challenges in their architecture. The key to mastery is how architects approach design, what they value, and where they focus their attention and work." As retail dies, whom will be the winners? | Peter Evans-Greenwood "The problem for many retailers is that how consumers shop has changed but the the retailers haven't adapted, " says Peter Evans-Greenwood. "Their sole virtue was to be the last step in a supply chain delivering somebody else's products to the consumer. However, being the last step in the supply chain is no longer a virtue when consumers skip across channels and can reach around the globe, no longer dependant on or limited to what they can find locally." Thought for the Day "Brains require stimulation. If you're locked into a pattern of work, work, and more work, your brain soon habituates - the same way that it lets you stop hearing a clock ticking. So, if you want to be more effective at work, you must, paradoxically, be less single-minded in your devotion to work. Anything you do—anything—that stimulates new segments of your brain will make you a more effective programmer or analyst. I promise, with a money-back guarantee." — Gerald M. Weinberg Source: SoftwareQuotes.com

Read the article
How to properly do weapon cool-down reload timer in multi-player laggy environment?

- by John Murdoch

I want to handle weapon cool-down timers in a fair and predictable way on both client on server. Situation: Multiple clients connected to server, which is doing hit detection / physics Clients have different latency for their connections to server ranging from 50ms to 500ms. They want to shoot weapons with fairly long reload/cool-down times (assume exactly 10 seconds) It is important that they get to shoot these weapons close to the cool-down time, as if some clients manage to shoot sooner than others (either because they are "early" or the others are "late") they gain a significant advantage. I need to show time remaining for reload on player's screen Clients can have clocks which are flat-out wrong (bad timezones, etc.) What I'm currently doing to deal with latency: Client collects server side state in a history, tagged with server timestamps Client assesses his time difference with server time: behindServerTimeNs = (behindServerTimeNs + (System.nanoTime() - receivedState.getServerTimeNs())) / 2 Client renders all state received from server 200 ms behind from his current time, adjusted by what he believes his time difference with server time is (whether due to wrong clocks, or lag). If he has server states on both sides of that calculated time, he (mostly LERP) interpolates between them, if not then he (LERP) extrapolates. No other client-side prediction of movement, e.g., to make his vehicle seem more responsive is done so far, but maybe will be added later So how do I properly add weapon reload timers? My first idea would be for the server to send each player the time when his reload will be done with each world state update, the client then adjusts it for the clock difference and thus can estimate when the reload will be finished in client-time (perhaps considering also for latency that the shoot message from client to server will take as well?), and if the user mashes the "shoot" button after (or perhaps even slightly before?) that time, send the shoot event. The server would get the shoot event and consider the time shot was made as the server time when it was received. It would then discard it if it is nowhere near reload time, execute it immediately if it is past reload time, and hold it for a few physics cycles until reload is done in case if it was received a bit early. It does all seem a bit convoluted, and I'm wondering whether it will work (e.g., whether it won't be the case that players with lower ping get better reload rates), and whether there are more elegant solutions to this problem.

Read the article
download and process a file by ftp at set intervals, with error handling, rescheduling and status messages

- by compound eye

I want to download a data file from a remote ftp server to my machine at regular intervals. Once the file is downloaded I want to call another script which will process the file. My development machine is mac os x, the eventual deployment environment is linux. What's would be the stock standard way to automate this? I know I can use cron to schedule curl to download and to run a script that will process the downloaded file at regular intervals, and I know could write a slightly more complex script or an application that would do this and add error handling, rescheduling and sending status emails. But one of my requirements for this project is to write as little custom code as possible, instead I should try to use standard, tried and true existing tools, and if I do have to write code, to try and write the most straightforward code possible. The reason for this is the code will potentially be installed on a large number of machines, all of which will need to be tweaked, customised and maintained by different people, long after I am gone from the project, so the intention is to use well documented, well supported tools as much as possible. This seems such a common task, there must be tools and scripts all over the internet, written by people who have carefully considered everything that could possibly go wrong when you need to download and process a file from a remote server at regular intervals, with error handling, rescheduling and sending status messages. Is that what Expect is for? What would you recommend? (the system will be downloading weather prediction data every six hours, so that the system can prepare in the event of bad weather warnings)

Read the article
Craig Mundie's video

- by GGBlogger

Timothy recently posted “Microsoft Shows Off Radical New UI, Could Be Used In Windows 8” on Slashdot. I took such grave exception to his post that I found it necessary to my senses to write this blog. We need to go back many years to the days of hand cranked calculators and early main frame computers. These devices had singular purposes – they were “number crunchers” used to make accounting easier. The front facing display in early mainframes was “blinken lights.” The calculators did provide printing – in the form of paper tape and the mainframes used line printers to generate reports as needed. We had other metaphors to work with. The typewriter was/is a mechanical device that substitutes for a type setting machine. The originals go back to 1867 and the keyboard layout has remained much the same to this day. In the earlier years the Morse code telegraphs gave way to Teletype machines. The old ASR33, seen on the left in this photo of one of the first computers I help manufacture, used a keyboard very similar to the keyboards in use today. It also generated punched paper tape that we generated to program this computer in machine language. Everything considered this computer which dates back to the late 1960s has a keyboard for input and a roll of paper as output. So in a very rudimentary fashion little has changed. Oh – we didn’t have a mouse! The entire point of this exercise is to point out that we still use very similar methods to get data into and out of a computer regardless of the operating system involved. The Altair, IMSAI, Apple, Commodore and onward to our modern machines changed the hardware that we interfaced to but changed little in the way we input, view and output the results of our computing effort. The mouse made some changes and the advent of windowed interfaces such as Windows and Apple made things somewhat easier for the user. My 4 year old granddaughter plays here Dora games on our computer. She knows how to start programs, use the mouse, play the game and is quite adept so we have come some distance in making computers useable. One of my chief bitches is the constant harangues leveled at Microsoft. Yup – they are a money making organization. You like Apple? No problem for me. I don’t use Apple mostly because I’m comfortable in the Windows environment but probably more because I don’t like Apple’s “Holier than thou” attitude. Some think they do superior things and that’s also fine with me. Obviously the iPhone has not done badly and other Apple products have fared well. But they are expensive. I just build a new machine with 4 Terabytes of storage, an Intel i7 Core 950 processor and 12 GB of RAMIII. It cost me – with dual monitors – less than 2000 dollars. Now to the chief reason for this blog. I’m going to continue developing software for as long as I’m able. For that reason I don’t see my keyboard, mouse and displays changing much for many years. I also don’t think Microsoft is going to spoil that for me by making radical changes to my developer experience. What Craig Mundie does in his video here: http://www.ispyce.com/2011/02/microsoft-shows-off-radical-new-ui.html is explore the potential future of computer interfaces for the masses of potential users. Using a computer today requires a person to have rudimentary capabilities with keyboards and the mouse. Wouldn’t it be great if all they needed was hand gestures? Although not mentioned it would also be nice if computers responded intelligently to a user’s voice. There is absolutely no argument with the fact that user interaction with these machines is going to change over time. My personal prediction is that it will take years for much of what Craig discusses to come to a cost effective reality but it is certainly coming. I just don’t believe that what Craig discusses will be the future look of a Window 8.

Read the article
SQL SERVER – Simple Demo of New Cardinality Estimation Features of SQL Server 2014

- by Pinal Dave

SQL Server 2014 has new cardinality estimation logic/algorithm. The cardinality estimation logic is responsible for quality of query plans and majorly responsible for improving performance for any query. This logic was not updated for quite a while, but in the latest version of SQL Server 2104 this logic is re-designed. The new logic now incorporates various assumptions and algorithms of OLTP and warehousing workload. Cardinality estimates are a prediction of the number of rows in the query result. The query optimizer uses these estimates to choose a plan for executing the query. The quality of the query plan has a direct impact on improving query performance. ~ Souce MSDN Let us see a quick example of how cardinality improves performance for a query. I will be using the AdventureWorks database for my example. Before we start with this demonstration, remember that even though you have SQL Server 2014 to see the effect of new cardinality estimates, you will need your database compatibility mode set to 120 which is for SQL Server 2014. If your server instance of SQL Server 2014 but you have set up your database compatibility mode to 110 or any other earlier version, you will get performance from your query like older version of SQL Server. Now we will execute following query in two different compatibility mode and see its performance. (Note that my SQL Server instance is of version 2014). USE AdventureWorks2014 GO -- ------------------------------- -- NEW Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 120 GO EXEC [dbo].[uspGetManagerEmployees] 44 GO -- ------------------------------- -- Old Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 110 GO EXEC [dbo].[uspGetManagerEmployees] 44 GO Result of Statistics IO Compatibility level 120 Table ‘Person’. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Employee’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Compatibility level 110 Table ‘Worktable’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Person’. Scan count 0, logical reads 137, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Employee’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. You will notice in the case of compatibility level 110 there 137 logical read from table person where as in the case of compatibility level 120 there are only 6 physical reads from table person. This drastically improves the performance of the query. If we enable execution plan, we can see the same as well. I hope you will find this quick example helpful. You can read more about this in my latest Pluralsight Course. Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
OpenWorld: Our (Road) Maps are Looking Good!

- by Tony Berk

Wow, only one (or two) days down at Oracle OpenWorld! Are you on overload yet? I'm still trying to figure out how to be in 3 sessions at the same time... I guess everyone needs to prioritize! There was a lot to see in Monday's sessions, especially some great forward-looking roadmap sessions. In case you aren't here or you decided to go to other sessions, this is my quick summary of what I could capture from a couple of the roadmaps: In the Fusion CRM Strategy and Roadmap session, Anthony Lye provided an overview of the Fusion CRM strategy including the key design principles of 3 E's: Easy, Effective and Efficient. After an overview of how Oracle has deployed Fusion CRM internally to 25,000 users worldwide, Anthony discussed the features coming in the next release, the releases in the next 12 months and beyond. I can't detail too much since you haven't read Oracle's Safe Harbor statement, but check out Fusion Tap and look for new features and added functionality for sales prediction, marketing, social and integration with a number of the key Customer Experience products. In the Oracle RightNow CX Cloud Service Vision and Roadmap session, Chris Hamilton presented the focus areas for the RightNow product. As a result of the large increase in development resources after the acquisition, the RightNow CX team is planning a lot of enhancements to the functionality, infrastructure and integrations. As a key piece of the Oracle Customer Experience (CX) strategy, RightNow will be integrated with Oracle Social Network, Oracle Commerce (ATG and Endeca), Oracle Knowledge, Oracle Policy Automation and, of course, further integration with Fusion Sales and Marketing. Look forward to seeing more on the Virtual Assistant, Smart Interaction Hub and Mobility. In addition to the roadmaps, I was looking forward to hearing from Oracle CRM customers. So, I sat in on two great Siebel customer panels: The Maximizing User Adoption Rates for Siebel Sales and Siebel Partner Relationship Management panel consisted of speakers from CSL Behring, McKesson and Intuit. It was great to get an overview of implementations for both B2B and B2C companies. It was great hearing that all of these companies have more than 1,000 sales users (Intuit has 4,000) and how the 360 degree view of the customer in Siebel is helping these customers improve their customers' experience (CX). They are all great examples of centralized implementations which have standardized processes across the globe and across business units. Waste Management, Farmers Insurance and the US Citizenship & Immigration Services presented in the Driving Great Customer Experiences with Siebel Service Applications session. Talk about serving large customer bases! Is it possible that Farmers with only 10 million households is the smallest of these 3? All of them provided great examples of how they are improving the customer experience (CX) including 60-70% improvements in efficiency or reducing the number of applications the customer service reps (CSRs) need to use from 10 to 1 (Waste Management) and context aware call transfers to avoid the caller explaining their issue 3 times (USCIS). So that's my wrap up of only 4 sessions from Monday. In between sessions, I stopped by the Oracle DEMOgrounds and CRM Pavilion to visit with a group of great partners and see the products and partner integrations in action. Don't miss a recap of Mark Hurd's Keynote. I can't believe there were another 40+ sessions covering CRM, Fusion, Cloud, etc. that I missed today! Anyone else see any great sessions?

Read the article
How can I apply a PSSM efficiently?

- by flies

I am fitting for position specific scoring matrices (PSSM aka Position Specific Weight Matrices). The fit I'm using is like simulated annealing, where I the perturb the PSSM, compare the prediction to experiment and accept the change if it improves agreement. This means I apply the PSSM millions of times per fit; performance is critical. In my particular problem, I'm applying a PSSM for an object of length L (~8 bp) at every position of a DNA sequence of length M (~30 bp) (so there are M-L+1 valid positions). I need an efficient algorithm to apply a PSSM. Can anyone help improve performance? My best idea is to convert the DNA into some kind of a matrix so that applying the PSSM is matrix multiplication. There are efficient linear algebra libraries out there (e.g. BLAS), but I'm not sure how best to turn an M-length DNA sequence into a matrix M x 4 matrix and then apply the PSSM at each position. The solution needs to work for higher order/dinucleotide terms in the PSSM - presumably this means representing the sequence-matrix for mono-nucleotides and separately for dinucleotides. My current solution iterates over each position m, then over each letter in word from m to m+L-1, adding the corresponding term in the matrix. I'm storing the matrix as a multi-dimensional STL vector, and profiling has revealed that a lot of the computation time is just accessing the elements of the PSSM (with similar performance bottlenecks accessing the DNA sequence). If someone has an idea besides matrix multiplication, I'm all ears.

Read the article
How to generate a monotone MART ROC in R?

- by user1521587

I am using R and applying MART (Alg. for multiple additive regression trees) on a training set to build prediction models. When I look at the ROC curve, it is not monotone. I would be grateful if someone can help me with how I should fix this. I am guessing the issue is that initially, MART generates n trees and if these trees are not the same for all the models I am building, the results will not be comparable. Here are the steps I take: 1) Fix the false-negative cost, c_fn. Let cost = c(0, 1, c_fn, 0). 2) use the following line to build the mart model: mart(x, y, lx, martmode='class', niter=2000, cost.mtx=cost) where x is the matrix of training set variables, y is the observation matrix, lx is the matrix which specifies which of the variables in x is numerical, which one categorical. 3) I predict the test set observations using the mart model found in step 2 using this line: y_pred = martpred(x_test, probs=T) 4) I compute the false-positive and false-negative errors as follows: t = 1/(1+c_fn) %threshold based on Bayes optimal rule where c_fp=1 and c_fn. p_0 = length(which(y_test==1))/dim(y_test)[1] p_01 = sum(1*(y_pred[,2]t & y_test==0))/dim(y_test)[1] p_11 = sum(1*(y_pred[,2]t & y_test==1))/dim(y_test)[1] p_fp = p_01/(1-p_0) p_tp = p_11/p_0 5) repeat step 1-4 for a new false-negative cost.

Read the article
regressions with many nested categorical covariates

- by eric

I have a few hundred thousand measurements where the dependent variable is a probability, and would like to use logistic regression. However, the covariates I have are all categorical, and worse, are all nested. By this I mean that if a certain measurement has "city - Phoenix" then obviously it is certain to have "state - Arizona" and "country - U.S." I have four such factors - the most granular has some 20k levels, but if need be I could do without that one, I think. I also have a few non-nested categorical covariates (only four or so, with maybe three different levels each). What I am most interested in is prediction - given a new observation in some city, I would like to know the relevant probability/dependent variable. I am not interested as much in the related inferential machinery - standard deviations, etc - at least as of now. I am hoping I can afford to be sloppy. However, I would love to have that information unless it requires methods that are more computationally expensive. Does anyone have any advice on how to attack this? I have looked into mixed effects, but am not sure it is what I am looking for.

Read the article
Big Data – Role of Cloud Computing in Big Data – Day 11 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the NewSQL. In this article we will understand the role of Cloud in Big Data Story What is Cloud? Cloud is the biggest buzzword around from last few years. Everyone knows about the Cloud and it is extremely well defined online. In this article we will discuss cloud in the context of the Big Data. Cloud computing is a method of providing a shared computing resources to the application which requires dynamic resources. These resources include applications, computing, storage, networking, development and various deployment platforms. The fundamentals of the cloud computing are that it shares pretty much share all the resources and deliver to end users as a service. Examples of the Cloud Computing and Big Data are Google and Amazon.com. Both have fantastic Big Data offering with the help of the cloud. We will discuss this later in this blog post. There are two different Cloud Deployment Models: 1) The Public Cloud and 2) The Private Cloud Public Cloud Public Cloud is the cloud infrastructure build by commercial providers (Amazon, Rackspace etc.) creates a highly scalable data center that hides the complex infrastructure from the consumer and provides various services. Private Cloud Private Cloud is the cloud infrastructure build by a single organization where they are managing highly scalable data center internally. Here is the quick comparison between Public Cloud and Private Cloud from Wikipedia: Public Cloud Private Cloud Initial cost Typically zero Typically high Running cost Unpredictable Unpredictable Customization Impossible Possible Privacy No (Host has access to the data Yes Single sign-on Impossible Possible Scaling up Easy while within defined limits Laborious but no limits Hybrid Cloud Hybrid Cloud is the cloud infrastructure build with the composition of two or more clouds like public and private cloud. Hybrid cloud gives best of the both the world as it combines multiple cloud deployment models together. Cloud and Big Data – Common Characteristics There are many characteristics of the Cloud Architecture and Cloud Computing which are also essentially important for Big Data as well. They highly overlap and at many places it just makes sense to use the power of both the architecture and build a highly scalable framework. Here is the list of all the characteristics of cloud computing important in Big Data Scalability Elasticity Ad-hoc Resource Pooling Low Cost to Setup Infastructure Pay on Use or Pay as you Go Highly Available Leading Big Data Cloud Providers There are many players in Big Data Cloud but we will list a few of the known players in this list. Amazon Amazon is arguably the most popular Infrastructure as a Service (IaaS) provider. The history of how Amazon started in this business is very interesting. They started out with a massive infrastructure to support their own business. Gradually they figured out that their own resources are underutilized most of the time. They decided to get the maximum out of the resources they have and hence they launched their Amazon Elastic Compute Cloud (Amazon EC2) service in 2006. Their products have evolved a lot recently and now it is one of their primary business besides their retail selling. Amazon also offers Big Data services understand Amazon Web Services. Here is the list of the included services: Amazon Elastic MapReduce – It processes very high volumes of data Amazon DynammoDB – It is fully managed NoSQL (Not Only SQL) database service Amazon Simple Storage Services (S3) – A web-scale service designed to store and accommodate any amount of data Amazon High Performance Computing – It provides low-tenancy tuned high performance computing cluster Amazon RedShift – It is petabyte scale data warehousing service Google Though Google is known for Search Engine, we all know that it is much more than that. Google Compute Engine – It offers secure, flexible computing from energy efficient data centers Google Big Query – It allows SQL-like queries to run against large datasets Google Prediction API – It is a cloud based machine learning tool Other Players Besides Amazon and Google we also have other players in the Big Data market as well. Microsoft is also attempting Big Data with the Cloud with Microsoft Azure. Additionally Rackspace and NASA together have initiated OpenStack. The goal of Openstack is to provide a massively scaled, multitenant cloud that can run on any hardware. Thing to Watch The cloud based solutions provides a great integration with the Big Data’s story as well it is very economical to implement as well. However, there are few things one should be very careful when deploying Big Data on cloud solutions. Here is a list of a few things to watch: Data Integrity Initial Cost Recurring Cost Performance Data Access Security Location Compliance Every company have different approaches to Big Data and have different rules and regulations. Based on various factors, one can implement their own custom Big Data solution on a cloud. Tomorrow In tomorrow’s blog post we will discuss about various Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Red Gate in the Community

- by Nick Harrison

Much has been said recently about Red Gate's community involvement and commitment to the DotNet community. Much of this has been unduly negative. Before you start throwing stones and spewing obscenities, consider some additional facts: Red Gate's software is actually very good. I have worked on many projects where Red Gate's software was instrumental in finishing successfully. Red Gate is VERY good to the community. I have spoken at many user groups and code camps where Red Gate has been a sponsor. Red Gate consistently offers up money to pay for the venue or food, and they will often give away licenses as door prizes. There are many such community events that would not take place without Red Gate's support. All I have ever seen them ask for is to have their products mentioned or be listed as a sponsor. They don't insist on anyone following a specific script. They don't monitor how their products are showcased. They let their products speak for themselves. Red Gate sponsors the Simple Talk web site. I publish there regularly. Red Gate has never exerted editorial pressure on me. No one has ever told me we can't publish this unless you mention Red Gate products. No one has ever said, you need to say nice things about Red Gate products in order to be published. They have told me, "you need to make this less academic, so you don't alienate too many readers. "You need to actually write an introduction so people will know what you are talking about". "You need to write this so that someone who isn't a reflection nut will follow what you are trying to say." In short, they have been good editors worried about the quality of the content and what the readers are likely to be interested in. For me personally, Red Gate and Simple Talk have both been excellent to work with. As for the developer outrage… I am a little embarrassed by so much of the response that I am seeing. So much of the complaints remind me of little children whining "but you promised" Semantics aside. A promise is just a promise. It's not like they "pinky sweared". Sadly no amount name calling or "double dog daring" will change the economics of the situation. Red Gate is not a multibillion dollar corporation. They are a mid size company doing the best they can. Without a doubt, their pockets are not as deep as Microsoft's. I honestly believe that they did try to make the "freemium" model work. Sadly it did not. I have no doubt that they intended for it to work and that they tried to make it work. I also have no doubt that they labored over making this decision. This could not have been an easy decision to make. Many people are gleefully proclaiming a massive backlash against Red Gate swearing off their wonderful products and promising to bash them at every opportunity from now on. This is childish behavior that does not represent professionals. This type of behavior is more in line with bullies in the school yard than professionals in a professional community. Now for my own prediction… This back lash against Red Gate is not likely to last very long. We will all realize that we still need their products. We may look around for alternatives, but realize that they really do have the best in class for every product that they produce, and that they really are not exorbitantly priced. We will see them sponsoring Code Camps and User Groups and be reminded, "hey this isn't such a bad company". On the other hand, software shops like Red Gate, will remember this back lash and give a second thought to supporting open source projects. They will worry about getting involved when an individual wants to turn over control for a product that they developed but can no longer support alone. Who wants to run the risk of not being able to follow through on their best intentions. In the end we may all suffer, even the toddlers among us throwing the temper tantrum, "BUT YOU PROMISED!" Disclaimer Before anyone asks or jumps to conclusions, I do not get paid by Red Gate to say any of this. I have often written about their products, and I have long thought that they are a wonderful company with amazing products. If they ever open an office in the SE United States, I will be one of the first to apply.

Read the article
Solaris 11.1 changes building of code past the point of __NORETURN

- by alanc

While Solaris 11.1 was under development, we started seeing some errors in the builds of the upstream X.Org git master sources, such as: "Display.c", line 65: Function has no return statement : x_io_error_handler "hostx.c", line 341: Function has no return statement : x_io_error_handler from functions that were defined to match a specific callback definition that declared them as returning an int if they did return, but these were calling exit() instead of returning so hadn't listed a return value. These had been generating warnings for years which we'd been ignoring, but X.Org has made enough progress in cleaning up code for compiler warnings and static analysis issues lately, that the community turned up the default error levels, including the gcc flag -Werror=return-type and the equivalent Solaris Studio cc flags -v -errwarn=E_FUNC_HAS_NO_RETURN_STMT, so now these became errors that stopped the build. Yet on Solaris, gcc built this code fine, while Studio errored out. Investigation showed this was due to the Solaris headers, which during Solaris 10 development added a number of annotations to the headers when gcc was being used for the amd64 kernel bringup before the Studio amd64 port was ready. Since Studio did not support the inline form of these annotations at the time, but instead used #pragma for them, the definitions were only present for gcc. To resolve this, I fixed both sides of the problem, so that it would work for building new X.Org sources on older Solaris releases or with older Studio compilers, as well as fixing the general problem before it broke more software building on Solaris. To the X.Org sources, I added the traditional Studio #pragma does_not_return to recognize that functions like exit() don't ever return, in patches such as this Xserver patch. Adding a dummy return statement was ruled out as that introduced unreachable code errors from compilers and analyzers that correctly realized you couldn't reach that code after a return statement. And on the Solaris 11.1 side, I updated the annotation definitions in <sys/ccompile.h> to enable for Studio 12.0 and later compilers the annotations already existing in a number of system headers for functions like exit() and abort(). If you look in that file you'll see the annotations we currently use, though the forms there haven't gone through review to become a Committed interface, so may change in the future. Actually getting this integrated into Solaris though took a bit more work than just editing one header file. Our ELF binary build comparison tool, wsdiff, actually showed a large number of differences in the resulting binaries due to the compiler using this information for branch prediction, code path analysis, and other possible optimizations, so after comparing enough of the disassembly output to be comfortable with the changes, we also made sure to get this in early enough in the release cycle so that it would get plenty of test exposure before the release. It also required updating quite a bit of code to avoid introducing new lint or compiler warnings or errors, and people building applications on top of Solaris 11.1 and later may need to make similar changes if they want to keep their build logs similarly clean. Previously, if you had a function that was declared with a non-void return type, lint and cc would warn if you didn't return a value, even if you called a function like exit() or panic() that ended execution. For instance: #include <stdlib.h> int callback(int status) { if (status == 0) return status; exit(status); } would previously require a never executed return 0; after the exit() to avoid lint warning "function falls off bottom without returning value". Now the compiler & lint will both issue "statement not reached" warnings for a return 0; after the final exit(), allowing (or in some cases, requiring) it to be removed. However, if there is no return statement anywhere in the function, lint will warn that you've declared a function returning a value that never does so, suggesting you can declare it as void. Unfortunately, if your function signature is required to match a certain form, such as in a callback, you not be able to do so, and will need to add a /* LINTED */ to the end of the function. If you need your code to build on both a newer and an older release, then you will either need to #ifdef these unreachable statements, or, to keep your sources common across releases, add to your sources the corresponding #pragma recognized by both current and older compiler versions, such as: #pragma does_not_return(exit) #pragma does_not_return(panic) Hopefully this little extra work is paid for by the compilers & code analyzers being able to better understand your code paths, giving you better optimizations and more accurate errors & warning messages.

Read the article
Please give the solution of the following programs in R Programming

- by NEETHU

Table below gives data concerning the performance of 28 national football league teams in 1976.It is suspected that the no. of yards gained rushing by opponents(x8) has an effect on the no. of games won by a team(y) (a)Fit a simple linear regression model relating games won by y to yards gained rushing by opponents x8. (b)Construct the analysis of variance table and test for significance of regression. (c)Find a 95% CI on the slope. (d)What percent of the total variability in y is explained by this model. (e)Find a 95% CI on the mean number of games won in opponents yards rushing is limited to 2000 yards. Team y x8 1 10 2205 2 11 2096 3 11 1847 4 13 1803 5 10 1457 6 11 1848 7 10 1564 8 11 1821 9 4 2577 10 2 2476 11 7 1984 12 10 1917 13 9 1761 14 9 1709 15 6 1901 16 5 2288 17 5 2072 18 5 2861 19 6 2411 20 4 2289 21 3 2203 22 3 2592 23 4 2053 24 10 1979 25 6 2048 26 8 1786 27 2 2876 28 0 2560 Suppose we would like to use the model developed in problem 1 to predict the no. of games a team will win if it can limit opponents yards rushing to 1800 yards. Find a point estimate of the no. of games won when x8=1800.Find a 905 prediction interval on the no. of games won. The purity of Oxygen produced by a fractionation process is thought to be percentage of Hydrocarbon in the main condenser of the processing unit .20 samples are shown below. Purity(%) Hydrocarbon(%) 86.91 1.02 89.85 1.11 90.28 1.43 86.34 1.11 92.58 1.01 87.33 0.95 86.29 1.11 91.86 0.87 95.61 1.43 89.86 1.02 96.73 1.46 99.42 1.55 98.66 1.55 96.07 1.55 93.65 1.4 87.31 1.15 95 1.01 96.85 0.99 85.2 0.95 90.56 0.98 (a)Fit a simple linear regression model to the data. (b)Test the hypothesis H0:ß=0 (c)Calculate R2 . (d)Find a 95% CI on the slope. (e)Find a 95% CI on the mean purity and the Hydrocarbon % is 1. Consider the Oxygen plant data in Problem3 and assume that purity and Hydrocarbon percentage are jointly normally distributed r.vs (a)What is the correlation between Oxygen purity and Hydrocarbon% (b)Test the hypothesis that ?=0. (c)Construct a 95% CI for ?.

Read the article
Keeping files or database records? Java and Python

- by danpalmer

My website will use a Neural Network to predict thing based on user data. The user can select the data to be used in training the network and then use their trained network to predict things. I am using a framework to create, train and query the networks. This uses Java. The framework has persistence for saving a network to an XML file. What is the best way to store these files? I can see several potential ideas, but I need help on choosing which is best: Save each network to a separate XML file with a name that is stored in the database. Load this each time. Save all the networks to the same XML file with each network having a different name that is stored in the database. Somehow pass what would normally be written to an XML file to the Django site for writing to the database. This would need to be returned to the Java code when a prediction needs to be made. I am able to do 1 or 2, but I think their performance will be quite limited and I am on shared hosting at the moment, so I don't know how pleased they would be with thousands of files. Also, after adding a few thousand records to one XML file, I was noticing a massive performance hit on saving to it. If I were able to implement version 3 somehow I think it would be best. No issues with separate processes accessing the database and I think performance would be better. Not to mention having no files lying around. However, the stuff in the neural network framework I am using (Encog) for saving to a file needs access to a Java file object, not a string that could be saved to a database. Unless there is some Java magic I can do here (I know very little Java), the only way I can see of doing this would be with a temporary files but I don't know if this is the correct way to do it. I would appreciate any ideas on the best way to implement any of the above 3 ideas or any alternatives. Thanks!

Read the article
A Guided Tour of Complexity

- by JoshReuben

I just re-read Complexity – A Guided Tour by Melanie Mitchell , protégé of Douglas Hofstadter ( author of “Gödel, Escher, Bach”) http://www.amazon.com/Complexity-Guided-Tour-Melanie-Mitchell/dp/0199798109/ref=sr_1_1?ie=UTF8&qid=1339744329&sr=8-1 here are some notes and links: Evolved from Cybernetics, General Systems Theory, Synergetics some interesting transdisciplinary fields to investigate: Chaos Theory - http://en.wikipedia.org/wiki/Chaos_theory – small differences in initial conditions (such as those due to rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible. System Dynamics / Cybernetics - http://en.wikipedia.org/wiki/System_Dynamics – study of how feedback changes system behavior Network Theory - http://en.wikipedia.org/wiki/Network_theory – leverage Graph Theory to analyze symmetric / asymmetric relations between discrete objects Algebraic Topology - http://en.wikipedia.org/wiki/Algebraic_topology – leverage abstract algebra to analyze topological spaces There are limits to deterministic systems & to computation. Chaos Theory definitely applies to training an ANN (artificial neural network) – different weights will emerge depending upon the random selection of the training set. In recursive Non-Linear systems http://en.wikipedia.org/wiki/Nonlinear_system – output is not directly inferable from input. E.g. a Logistic map: Xt+1 = R Xt(1-Xt) Different types of bifurcations, attractor states and oscillations may occur – e.g. a Lorenz Attractor http://en.wikipedia.org/wiki/Lorenz_system Feigenbaum Constants http://en.wikipedia.org/wiki/Feigenbaum_constants express ratios in a bifurcation diagram for a non-linear map – the convergent limit of R (the rate of period-doubling bifurcations) is 4.6692016 Maxwell’s Demon - http://en.wikipedia.org/wiki/Maxwell%27s_demon - the Second Law of Thermodynamics has only a statistical certainty – the universe (and thus information) tends towards entropy. While any computation can theoretically be done without expending energy, with finite memory, the act of erasing memory is permanent and increases entropy. Life & thought is a counter-example to the universe’s tendency towards entropy. Leo Szilard and later Claude Shannon came up with the Information Theory of Entropy - http://en.wikipedia.org/wiki/Entropy_(information_theory) whereby Shannon entropy quantifies the expected value of a message’s information in bits in order to determine channel capacity and leverage Coding Theory (compression analysis). Ludwig Boltzmann came up with Statistical Mechanics - http://en.wikipedia.org/wiki/Statistical_mechanics – whereby our Newtonian perception of continuous reality is a probabilistic and statistical aggregate of many discrete quantum microstates. This is relevant for Quantum Information Theory http://en.wikipedia.org/wiki/Quantum_information and the Physics of Information - http://en.wikipedia.org/wiki/Physical_information. Hilbert’s Problems http://en.wikipedia.org/wiki/Hilbert's_problems pondered whether mathematics is complete, consistent, and decidable (the Decision Problem – http://en.wikipedia.org/wiki/Entscheidungsproblem – is there always an algorithm that can determine whether a statement is true). Godel’s Incompleteness Theorems http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems proved that mathematics cannot be both complete and consistent (e.g. “This statement is not provable”). Turing through the use of Turing Machines (http://en.wikipedia.org/wiki/Turing_machine symbol processors that can prove mathematical statements) and Universal Turing Machines (http://en.wikipedia.org/wiki/Universal_Turing_machine Turing Machines that can emulate other any Turing Machine via accepting programs as well as data as input symbols) that computation is limited by demonstrating the Halting Problem http://en.wikipedia.org/wiki/Halting_problem (is is not possible to know when a program will complete – you cannot build an infinite loop detector). You may be used to thinking of 1 / 2 / 3 dimensional systems, but Fractal http://en.wikipedia.org/wiki/Fractal systems are defined by self-similarity & have non-integer Hausdorff Dimensions !!! http://en.wikipedia.org/wiki/List_of_fractals_by_Hausdorff_dimension – the fractal dimension quantifies the number of copies of a self similar object at each level of detail – eg Koch Snowflake - http://en.wikipedia.org/wiki/Koch_snowflake Definitions of complexity: size, Shannon entropy, Algorithmic Information Content (http://en.wikipedia.org/wiki/Algorithmic_information_theory - size of shortest program that can generate a description of an object) Logical depth (amount of info processed), thermodynamic depth (resources required). Complexity is statistical and fractal. John Von Neumann’s other machine was the Self-Reproducing Automaton http://en.wikipedia.org/wiki/Self-replicating_machine . Cellular Automata http://en.wikipedia.org/wiki/Cellular_automaton are alternative form of Universal Turing machine to traditional Von Neumann machines where grid cells are locally synchronized with their neighbors according to a rule. Conway’s Game of Life http://en.wikipedia.org/wiki/Conway's_Game_of_Life demonstrates various emergent constructs such as “Glider Guns” and “Spaceships”. Cellular Automatons are not practical because logical ops require a large number of cells – wasteful & inefficient. There are no compilers or general program languages available for Cellular Automatons (as far as I am aware). Random Boolean Networks http://en.wikipedia.org/wiki/Boolean_network are extensions of cellular automata where nodes are connected at random (not to spatial neighbors) and each node has its own rule –> they demonstrate the emergence of complex & self organized behavior. Stephen Wolfram’s (creator of Mathematica, so give him the benefit of the doubt) New Kind of Science http://en.wikipedia.org/wiki/A_New_Kind_of_Science proposes the universe may be a discrete Finite State Automata http://en.wikipedia.org/wiki/Finite-state_machine whereby reality emerges from simple rules. I am 2/3 through this book. It is feasible that the universe is quantum discrete at the plank scale and that it computes itself – Digital Physics: http://en.wikipedia.org/wiki/Digital_physics – a simulated reality? Anyway, all behavior is supposedly derived from simple algorithmic rules & falls into 4 patterns: uniform , nested / cyclical, random (Rule 30 http://en.wikipedia.org/wiki/Rule_30) & mixed (Rule 110 - http://en.wikipedia.org/wiki/Rule_110 localized structures – it is this that is interesting). interaction between colliding propagating signal inputs is then information processing. Wolfram proposes the Principle of Computational Equivalence - http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html - all processes that are not obviously simple can be viewed as computations of equivalent sophistication. Meaning in information may emerge from analogy & conceptual slippages – see the CopyCat program: http://cognitrn.psych.indiana.edu/rgoldsto/courses/concepts/copycat.pdf Scale Free Networks http://en.wikipedia.org/wiki/Scale-free_network have a distribution governed by a Power Law (http://en.wikipedia.org/wiki/Power_law - much more common than Normal Distribution). They are characterized by hubs (resilience to random deletion of nodes), heterogeneity of degree values, self similarity, & small world structure. They grow via preferential attachment http://en.wikipedia.org/wiki/Preferential_attachment – tipping points triggered by positive feedback loops. 2 theories of cascading system failures in complex systems are Self-Organized Criticality http://en.wikipedia.org/wiki/Self-organized_criticality and Highly Optimized Tolerance http://en.wikipedia.org/wiki/Highly_optimized_tolerance. Computational Mechanics http://en.wikipedia.org/wiki/Computational_mechanics – use of computational methods to study phenomena governed by the principles of mechanics. This book is a great intuition pump, but does not cover the more mathematical subject of Computational Complexity Theory – http://en.wikipedia.org/wiki/Computational_complexity_theory I am currently reading this book on this subject: http://www.amazon.com/Computational-Complexity-Christos-H-Papadimitriou/dp/0201530821/ref=pd_sim_b_1 stay tuned for that review!

Read the article
Big Data’s Killer App…

- by jean-pierre.dijcks

Recently Keith spent some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

Read the article
Is RTD Stateless or Stateful?

- by [email protected]

Yes. A stateless service is one where each request is an independent transaction that can be processed by any of the servers in a cluster. A stateful service is one where state is kept in a server's memory from transaction to transaction, thus necessitating the proper routing of requests to the right server. The main advantage of stateless systems is simplicity of design. The main advantage of stateful systems is performance. I'm often asked whether RTD is a stateless or stateful service, so I wanted to clarify this issue in depth so that RTD's architecture will be properly understood. The short answer is: "RTD can be configured as a stateless or stateful service." The performance difference between stateless and stateful systems can be very significant, and while in a call center implementation it may be reasonable to use a pure stateless configuration, a web implementation that produces thousands of requests per second is practically impossible with a stateless configuration. RTD's performance is orders of magnitude better than most competing systems. RTD was architected from the ground up to achieve this performance. Features like automatic and dynamic compression of prediction models, automatic translation of metadata to machine code, lack of interpreted languages, and separation of model building from decisioning contribute to achieving this performance level. Because of this focus on performance we decided to have RTD's default configuration work in a stateful manner. By being stateful RTD requests are typically handled in a few milliseconds when repeated requests come to the same session. Now, those readers that have participated in implementations of RTD know that RTD's architecture is also focused on reducing Total Cost of Ownership (TCO) with features like automatic model building, automatic time windows, automatic maintenance of database tables, automatic evaluation of data mining models, automatic management of models partitioned by channel, geography, etcetera, and hot swapping of configurations. How do you reconcile the need for a low TCO and the need for performance? How do you get the performance of a stateful system with the simplicity of a stateless system? The answer is that you make the system behave like a stateless system to the exterior, but you let it automatically take advantage of situations where being stateful is better. For example, one of the advantages of stateless systems is that you can route a message to any server in a cluster, without worrying about sending it to the same server that was handling the session in previous messages. With an RTD stateful configuration you can still route the message to any server in the cluster, so from the point of view of the configuration of other systems, it is the same as a stateless service. The difference though comes in performance, because if the message arrives to the right server, RTD can serve it without any external access to the session's state, thus tremendously reducing processing time. In typical implementations it is not rare to have high percentages of messages routed directly to the right server, while those that are not, are easily handled by forwarding the messages to the right server. This architecture usually provides the best of both worlds with performance and simplicity of configuration. Configuring RTD as a pure stateless service A pure stateless configuration requires session data to be persisted at the end of handling each and every message and reloading that data at the beginning of handling any new message. This is of course, the root of the inefficiency of these configurations. This is also the reason why many "stateless" implementations actually do keep state to take advantage of a request coming back to the same server. Nevertheless, if the implementation requires a pure stateless decision service, this is easy to configure in RTD. The way to do it is: Mark every Integration Point to Close the session at the end of processing the message In the Session entity persist the session data on closing the session In the session entity check if a persisted version exists and load it An excellent solution for persisting the session data is Oracle Coherence, which provides a high performance, distributed cache that minimizes the performance impact of persisting and reloading the session. Alternatively, the session can be persisted to a local database. An interesting feature of the RTD stateless configuration is that it can cope with serializing concurrent requests for the same session. For example, if a web page produces two requests to the decision service, these requests could come concurrently to the decision services and be handled by different servers. Most stateless implementation would have the two requests step onto each other when saving the state, or fail one of the messages. When properly configured, RTD will make one message wait for the other before processing. A Word on Context Using the context of a customer interaction typically significantly increases lift. For example, offer success in a call center could double if the context of the call is taken into account. For this reason, it is important to utilize the contextual information in decision making. To make the contextual information available throughout a session it needs to be persisted. When there is a well defined owner for the information then there is no problem because in case of a session restart, the information can be easily retrieved. If there is no official owner of the information, then RTD can be configured to persist this information. Once again, RTD provides flexibility to ensure high performance when it is adequate to allow for some loss of state in the rare cases of server failure. For example, in a heavy use web site that serves 1000 pages per second the navigation history may be stored in the in memory session. In such sites it is typical that there is no OLTP that stores all the navigation events, therefore if an RTD server were to fail, it would be possible for the navigation to that point to be lost (note that a new session would be immediately established in one of the other servers). In most cases the loss of this navigation information would be acceptable as it would happen rarely. If it is desired to save this information, RTD would persist it every time the visitor navigates to a new page. Note that this practice is preferred whether RTD is configured in a stateless or stateful manner.

Read the article
C# Neural Networks with Encog

- by JoshReuben

Neural Networks · I recently read a book Introduction to Neural Networks for C# , by Jeff Heaton. http://www.amazon.com/Introduction-Neural-Networks-C-2nd/dp/1604390093/ref=sr_1_2?ie=UTF8&s=books&qid=1296821004&sr=8-2-spell. Not the 1st ANN book I've perused, but a nice revision. · Artificial Neural Networks (ANNs) are a mechanism of machine learning – see http://en.wikipedia.org/wiki/Artificial_neural_network , http://en.wikipedia.org/wiki/Category:Machine_learning · Problems Not Suited to a Neural Network Solution- Programs that are easily written out as flowcharts consisting of well-defined steps, program logic that is unlikely to change, problems in which you must know exactly how the solution was derived. · Problems Suited to a Neural Network – pattern recognition, classification, series prediction, and data mining. Pattern recognition - network attempts to determine if the input data matches a pattern that it has been trained to recognize. Classification - take input samples and classify them into fuzzy groups. · As far as machine learning approaches go, I thing SVMs are superior (see http://en.wikipedia.org/wiki/Support_vector_machine ) - a neural network has certain disadvantages in comparison: an ANN can be overtrained, different training sets can produce non-deterministic weights and it is not possible to discern the underlying decision function of an ANN from its weight matrix – they are black box. · In this post, I'm not going to go into internals (believe me I know them). An autoassociative network (e.g. a Hopfield network) will echo back a pattern if it is recognized. · Under the hood, there is very little maths. In a nutshell - Some simple matrix operations occur during training: the input array is processed (normalized into bipolar values of 1, -1) - transposed from input column vector into a row vector, these are subject to matrix multiplication and then subtraction of the identity matrix to get a contribution matrix. The dot product is taken against the weight matrix to yield a boolean match result. For backpropogation training, a derivative function is required. In learning, hill climbing mechanisms such as Genetic Algorithms and Simulated Annealing are used to escape local minima. For unsupervised training, such as found in Self Organizing Maps used for OCR, Hebbs rule is applied. · The purpose of this post is not to mire you in technical and conceptual details, but to show you how to leverage neural networks via an abstraction API - Encog Encog · Encog is a neural network API · Links to Encog: http://www.encog.org , http://www.heatonresearch.com/encog, http://www.heatonresearch.com/forum · Encog requires .Net 3.5 or higher – there is also a Silverlight version. Third-Party Libraries – log4net and nunit. · Encog supports feedforward, recurrent, self-organizing maps, radial basis function and Hopfield neural networks. · Encog neural networks, and related data, can be stored in .EG XML files. · Encog Workbench allows you to edit, train and visualize neural networks. The Encog Workbench can generate code. Synapses and layers · the primary building blocks - Almost every neural network will have, at a minimum, an input and output layer. In some cases, the same layer will function as both input and output layer. · To adapt a problem to a neural network, you must determine how to feed the problem into the input layer of a neural network, and receive the solution through the output layer of a neural network. · The Input Layer - For each input neuron, one double value is stored. An array is passed as input to a layer. Encog uses the interface INeuralData to hold these arrays. The class BasicNeuralData implements the INeuralData interface. Once the neural network processes the input, an INeuralData based class will be returned from the neural network's output layer. · convert a double array into an INeuralData object : INeuralData data = new BasicNeuralData(= new double[10]); · the Output Layer- The neural network outputs an array of doubles, wraped in a class based on the INeuralData interface. · The real power of a neural network comes from its pattern recognition capabilities. The neural network should be able to produce the desired output even if the input has been slightly distorted. · Hidden Layers– optional. between the input and output layers. very much a “black box”. If the structure of the hidden layer is too simple it may not learn the problem. If the structure is too complex, it will learn the problem but will be very slow to train and execute. Some neural networks have no hidden layers. The input layer may be directly connected to the output layer. Further, some neural networks have only a single layer. A single layer neural network has the single layer self-connected. · connections, called synapses, contain individual weight matrixes. These values are changed as the neural network learns. Constructing a Neural Network · the XOR operator is a frequent “first example” -the “Hello World” application for neural networks. · The XOR Operator- only returns true when both inputs differ. 0 XOR 0 = 0 1 XOR 0 = 1 0 XOR 1 = 1 1 XOR 1 = 0 · Structuring a Neural Network for XOR - two inputs to the XOR operator and one output. · input: 0.0,0.0 1.0,0.0 0.0,1.0 1.0,1.0 · Expected output: 0.0 1.0 1.0 0.0 · A Perceptron - a simple feedforward neural network to learn the XOR operator. · Because the XOR operator has two inputs and one output, the neural network will follow suit. Additionally, the neural network will have a single hidden layer, with two neurons to help process the data. The choice for 2 neurons in the hidden layer is arbitrary, and often comes down to trial and error. · Neuron Diagram for the XOR Network · · The Encog workbench displays neural networks on a layer-by-layer basis. · Encog Layer Diagram for the XOR Network: · Create a BasicNetwork - Three layers are added to this network. the FinalizeStructure method must be called to inform the network that no more layers are to be added. The call to Reset randomizes the weights in the connections between these layers. var network = new BasicNetwork(); network.AddLayer(new BasicLayer(2)); network.AddLayer(new BasicLayer(2)); network.AddLayer(new BasicLayer(1)); network.Structure.FinalizeStructure(); network.Reset(); · Neural networks frequently start with a random weight matrix. This provides a starting point for the training methods. These random values will be tested and refined into an acceptable solution. However, sometimes the initial random values are too far off. Sometimes it may be necessary to reset the weights again, if training is ineffective. These weights make up the long-term memory of the neural network. Additionally, some layers have threshold values that also contribute to the long-term memory of the neural network. Some neural networks also contain context layers, which give the neural network a short-term memory as well. The neural network learns by modifying these weight and threshold values. · Now that the neural network has been created, it must be trained. Training a Neural Network · construct a INeuralDataSet object - contains the input array and the expected output array (of corresponding range). Even though there is only one output value, we must still use a two-dimensional array to represent the output. public static double[][] XOR_INPUT ={ new double[2] { 0.0, 0.0 }, new double[2] { 1.0, 0.0 }, new double[2] { 0.0, 1.0 }, new double[2] { 1.0, 1.0 } }; public static double[][] XOR_IDEAL = { new double[1] { 0.0 }, new double[1] { 1.0 }, new double[1] { 1.0 }, new double[1] { 0.0 } }; INeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL); · Training is the process where the neural network's weights are adjusted to better produce the expected output. Training will continue for many iterations, until the error rate of the network is below an acceptable level. Encog supports many different types of training. Resilient Propagation (RPROP) - general-purpose training algorithm. All training classes implement the ITrain interface. The RPROP algorithm is implemented by the ResilientPropagation class. Training the neural network involves calling the Iteration method on the ITrain class until the error is below a specific value. The code loops through as many iterations, or epochs, as it takes to get the error rate for the neural network to be below 1%. Once the neural network has been trained, it is ready for use. ITrain train = new ResilientPropagation(network, trainingSet); for (int epoch=0; epoch < 10000; epoch++) { train.Iteration(); Debug.Print("Epoch #" + epoch + " Error:" + train.Error); if (train.Error > 0.01) break; } Executing a Neural Network · Call the Compute method on the BasicNetwork class. Console.WriteLine("Neural Network Results:"); foreach (INeuralDataPair pair in trainingSet) { INeuralData output = network.Compute(pair.Input); Console.WriteLine(pair.Input[0] + "," + pair.Input[1] + ", actual=" + output[0] + ",ideal=" + pair.Ideal[0]); } · The Compute method accepts an INeuralData class and also returns a INeuralData object. Neural Network Results: 0.0,0.0, actual=0.002782538818034049,ideal=0.0 1.0,0.0, actual=0.9903741937121177,ideal=1.0 0.0,1.0, actual=0.9836807956566187,ideal=1.0 1.0,1.0, actual=0.0011646072586172778,ideal=0.0 · the network has not been trained to give the exact results. This is normal. Because the network was trained to 1% error, each of the results will also be within generally 1% of the expected value.

Read the article
Microsoft’s new technical computing initiative

- by Randy Walker

I made a mental note from earlier in the year. Microsoft literally buys computers by the truckload. From what I understand, it’s a typical practice amongst large software vendors. You plug a few wires in, you test it, and you instantly have mega tera tera flops (don’t hold me to that number). Microsoft has been trying to plug away at their cloud services (named Azure). Which, for the layman, means Microsoft runs your software on their computers, and as demand increases you can allocate more computing power on the fly. With this in mind, it doesn’t surprise me that I was recently sent an executive email concerning Microsoft’s new technical computing initiative. I find it to be a great marketing idea with actual substance behind their real work. From the programmer academic perspective, in college we dreamed about this type of processing power. This has decades of computer science theory behind it. A copy of the email received. (note that I almost deleted this email, thinking it was spam due to it’s length) We don't often think about how complex life really is. Take the relatively simple task of commuting to and from work: it is, in fact, a complicated interplay of variables such as weather, train delays, accidents, traffic patterns, road construction, etc. You can however, take steps to shorten your commute - using a good, predictive understanding of a few of these variables. In fact, you probably are already taking these inputs and instinctively building a predictive model that you act on daily to get to your destination more quickly. Now, when we apply the same method to very complex tasks, this modeling approach becomes much more challenging. Recent world events clearly demonstrated our inability to process vast amounts of information and variables that would have helped to more accurately predict the behavior of global financial markets or the occurrence and impact of a volcano eruption in Iceland. To make sense of issues like these, researchers, engineers and analysts create computer models of the almost infinite number of possible interactions in complex systems. But, they need increasingly more sophisticated computer models to better understand how the world behaves and to make fact-based predictions about the future. And, to do this, it requires a tremendous amount of computing power to process and examine the massive data deluge from cameras, digital sensors and precision instruments of all kinds. This is the key to creating more accurate and realistic models that expose the hidden meaning of data, which gives us the kind of insight we need to solve a myriad of challenges. We have made great strides in our ability to build these kinds of computer models, and yet they are still too difficult, expensive and time consuming to manage. Today, even the most complicated data-rich simulations cannot fully capture all of the intricacies and dependencies of the systems they are trying to model. That is why, across the scientific and engineering world, it is so hard to say with any certainty when or where the next volcano will erupt and what flight patterns it might affect, or to more accurately predict something like a global flu pandemic. So far, we just cannot collect, correlate and compute enough data to create an accurate forecast of the real world. But this is about to change. Innovations in technology are transforming our ability to measure, monitor and model how the world behaves. The implication for scientific research is profound, and it will transform the way we tackle global challenges like health care and climate change. It will also have a huge impact on engineering and business, delivering breakthroughs that could lead to the creation of new products, new businesses and even new industries. Because you are a subscriber to executive e-mails from Microsoft, I want you to be the first to know about a new effort focused specifically on empowering millions of the world's smartest problem solvers. Today, I am happy to introduce Microsoft's Technical Computing initiative. Our goal is to unleash the power of pervasive, accurate, real-time modeling to help people and organizations achieve their objectives and realize their potential. We are bringing together some of the brightest minds in the technical computing community across industry, academia and science at www.modelingtheworld.com to discuss trends, challenges and shared opportunities. New advances provide the foundation for tools and applications that will make technical computing more affordable and accessible where mathematical and computational principles are applied to solve practical problems. One day soon, complicated tasks like building a sophisticated computer model that would typically take a team of advanced software programmers months to build and days to run, will be accomplished in a single afternoon by a scientist, engineer or analyst working at the PC on their desktop. And as technology continues to advance, these models will become more complete and accurate in the way they represent the world. This will speed our ability to test new ideas, improve processes and advance our understanding of systems. Our technical computing initiative reflects the best of Microsoft's heritage. Ever since Bill Gates articulated the then far-fetched vision of "a computer on every desktop" in the early 1980's, Microsoft has been at the forefront of expanding the power and reach of computing to benefit the world. As someone who worked closely with Bill for many years at Microsoft, I am happy to share with you that the passion behind that vision is fully alive at Microsoft and is carried out in the creation of our new Technical Computing group. Enabling more people to make better predictions We have seen the impact of making greater computing power more available firsthand through our investments in high performance computing (HPC) over the past five years. Scientists, engineers and analysts in organizations of all sizes and sectors are finding that using distributed computational power creates societal impact, fuels scientific breakthroughs and delivers competitive advantages. For example, we have seen remarkable results from some of our current customers: Malaria strikes 300,000 to 500,000 people around the world each year. To help in the effort to eradicate malaria worldwide, scientists at Intellectual Ventures use software that simulates how the disease spreads and would respond to prevention and control methods, such as vaccines and the use of bed nets. Technical computing allows researchers to model more detailed parameters for more accurate results and receive those results in less than an hour, rather than waiting a full day. Aerospace engineering firm, a.i. solutions, Inc., needed a more powerful computing platform to keep up with the increasingly complex computational needs of its customers: NASA, the Department of Defense and other government agencies planning space flights. To meet that need, it adopted technical computing. Now, a.i. solutions can produce detailed predictions and analysis of the flight dynamics of a given spacecraft, from optimal launch times and orbit determination to attitude control and navigation, up to eight times faster. This enables them to avoid mistakes in any areas that can cause a space mission to fail and potentially result in the loss of life and millions of dollars. Western & Southern Financial Group faced the challenge of running ever larger and more complex actuarial models as its number of policyholders and products grew and regulatory requirements changed. The company chose an actuarial solution that runs on technical computing technology. The solution is easy for the company's IT staff to manage and adjust to meet business needs. The new solution helps the company reduce modeling time by up to 99 percent - letting the team fine-tune its models for more accurate product pricing and financial projections. Our Technical Computing direction Collaborating closely with partners across industry and academia, we must now extend the reach of technical computing even further to help predictive modelers and data explorers make faster, more accurate predictions. As we build the Technical Computing initiative, we will invest in three core areas: Technical computing to the cloud: Microsoft will play a leading role in bringing technical computing power to scientists, engineers and analysts through the cloud. Existing high- performance computing users will benefit from the ability to augment their on-premises systems with cloud resources that enable 'just-in-time' processing. This platform will help ensure processing resources are available whenever they are needed-reliably, consistently and quickly. Simplify parallel development: Today, computers are shipping with more processing power than ever, including multiple cores, but most modern software only uses a small amount of the available processing power. Parallel programs are extremely difficult to write, test and trouble shoot. However, a consistent model for parallel programming can help more developers unlock the tremendous power in today's modern computers and enable a new generation of technical computing. We are delivering new tools to automate and simplify writing software through parallel processing from the desktop... to the cluster... to the cloud. Develop powerful new technical computing tools and applications: We know scientists, engineers and analysts are pushing common tools (i.e., spreadsheets and databases) to the limits with complex, data-intensive models. They need easy access to more computing power and simplified tools to increase the speed of their work. We are building a platform to do this. Our development efforts will yield new, easy-to-use tools and applications that automate data acquisition, modeling, simulation, visualization, workflow and collaboration. This will allow them to spend more time on their work and less time wrestling with complicated technology. Thinking bigger There is so much left to be discovered and so many questions yet to be answered in the fascinating world around us. We believe the technical computing community will show us that we have not seen anything yet. Imagine just some of the breakthroughs this community could make possible: Better predictions to help improve the understanding of pandemics, contagion and global health trends. Climate change models that predict environmental, economic and human impact, accessible in real-time during key discussions and debates. More accurate prediction of natural disasters and their impact to develop more effective emergency response plans. With an ambitious charter in hand, this new team is ready to build on our progress to-date and execute Microsoft's technical computing vision over the months and years ahead. We will steadily invest in the right technologies, tools and talent, and work to bring together the technical computing community. I invite you to visit www.modelingtheworld.com today. We welcome your ideas and feedback. I look forward to making this journey with you and others who want to answer the world's biggest questions, discover solutions to problems that seem impossible and uncover a host of new opportunities to change the world we live in for the better. Bob

Read the article
CodePlex Daily Summary for Monday, November 22, 2010

CodePlex Daily Summary for Monday, November 22, 2010Popular ReleasesSQL Monitor: SQLMon 1.1: changes: 1.added sql job monitoring; 2.added settings save/loadASP.NET MVC Project Awesome (jQuery Ajax helpers): 1.3.1 and demos: A rich set of helpers (controls) that you can use to build highly responsive and interactive Ajax-enabled Web applications. These helpers include Autocomplete, AjaxDropdown, Lookup, Confirm Dialog, Popup Form and Pager tested on mozilla, safari, chrome, opera, ie 9b/8/7/6DotSpatial: DotSpatial 11-21-2010: This release introduces the following Fixed bugs related to dispose, which caused issues when reordering layers in the legend Fixed bugs related to assigning categories where NULL values are in the fields New fast-acting resize using a bitmap "prediction" of what the final resize content will look like. ImageData.ReadBlock, ImageData.WriteBlock These allow direct file access for reading or writing a rectangular window. Bitmaps are used for holding the values. Removed the need to stor...Minemapper - dynamic mapping for Windows: Minemapper v0.1.0: Pan by: dragging the mouse using the buttons Zoom by: scrolling the mouse wheel using the buttons using the slider Night support Biome support Skylight support Direction support: East West Height slicingMDownloader: MDownloader-0.15.24.6966: Fixed Updater; Fixed minor bugs;WPF Application Framework (WAF): WPF Application Framework (WAF) 2.0.0.1: Version: 2.0.0.1 (Milestone 1): This release contains the source code of the WPF Application Framework (WAF) and the sample applications. Requirements .NET Framework 4.0 (The package contains a solution file for Visual Studio 2010) The unit test projects require Visual Studio 2010 Professional Remark The sample applications are using Microsoft’s IoC container MEF. However, the WPF Application Framework (WAF) doesn’t force you to use the same IoC container in your application. You can use ...Smith Html Editor: Smith Html Editor V0.75: The first public release.MiniTwitter: 1.59: MiniTwitter 1.59 ???? ?? User Streams ????????????????? ?? ?????????????? ???????? ?????????????.NET Extensions - Extension Methods Library for C# and VB.NET: Release 2011.01: Added new extensions for - object.CountLoopsToNull Added new extensions for DateTime: - DateTime.IsWeekend - DateTime.AddWeeks Added new extensions for string: - string.Repeat - string.IsNumeric - string.ExtractDigits - string.ConcatWith - string.ToGuid - string.ToGuidSave Added new extensions for Exception: - Exception.GetOriginalException Added new extensions for Stream: - Stream.Write (overload) And other new methods ... Release as of dotnetpro 01/2011Code Sample from Microsoft: Visual Studio 2010 Code Samples 2010-11-19: Code samples for Visual Studio 2010Prism Training Kit: Prism Training Kit 4.0: Release NotesThis is an updated version of the Prism training Kit that targets Prism 4.0 and added labs for some of the new features of Prism 4.0. This release consists of a Training Kit with Labs on the following topics Modularity Dependency Injection Bootstrapper UI Composition Communication MEF Navigation Note: Take into account that this is a Beta version. If you find any bugs please report them in the Issue Tracker PrerequisitesVisual Studio 2010 Microsoft Word 2...Free language translator and file converter: Free Language Translator 2.2: Starting with version 2.0, the translator encountered a major redesign that uses MEF based plugins and .net 4.0. I've also fixed some bugs and added support for translating subtitles that can show up in video media players. Version 2.1 shows the context menu 'Translate' in Windows Explorer on right click. Version 2.2 has links to start the media file with its associated subtitle. Download the zip file and expand it in a temporary location on your local disk. At a minimum , you should uninstal...Free Silverlight & WPF Chart Control - Visifire: Visifire SL and WPF Charts v3.6.4 Released: Hi, Today we are releasing Visifire 3.6.4 with few bug fixes: * Multi-line Labels were getting clipped while exploding last DataPoint in Funnel and Pyramid chart. * ClosestPlotDistance property in Axis was not behaving as expected. * In DateTime Axis, Chart threw exception on mouse click over PlotArea if there were no DataPoints present in Chart. * ToolTip was not disappearing while changing the DataSource property of the DataSeries at real-time. * Chart threw exception ...Microsoft SQL Server Product Samples: Database: AdventureWorks 2008R2 SR1: Sample Databases for Microsoft SQL Server 2008R2 (SR1)This release is dedicated to the sample databases that ship for Microsoft SQL Server 2008R2. See Database Prerequisites for SQL Server 2008R2 for feature configurations required for installing the sample databases. See Installing SQL Server 2008R2 Databases for step by step installation instructions. The SR1 release contains minor bug fixes to the installer used to create the sample databases. There are no changes to the databases them...VidCoder: 0.7.2: Fixed duplicated subtitles when running multiple encodes off of the same title.Craig's Utility Library: Craig's Utility Library Code 2.0: This update contains a number of changes, added functionality, and bug fixes: Added transaction support to SQLHelper. Added linked/embedded resource ability to EmailSender. Updated List to take into account new functions. Added better support for MAC address in WMI classes. Fixed Parsing in Reflection class when dealing with sub classes. Fixed bug in SQLHelper when replacing the Command that is a select after doing a select. Fixed issue in SQL Server helper with regard to generati...MFCMAPI: November 2010 Release: Build: 6.0.0.1023 Full release notes at SGriffin's blog. If you just want to run the tool, get the executable. If you want to debug it, get the symbol file and the source. The 64 bit build will only work on a machine with Outlook 2010 64 bit installed. All other machines should use the 32 bit build, regardless of the operating system. Facebook BadgeDotNetNuke® Community Edition: 05.06.00: Major HighlightsAdded automatic portal alias creation for single portal installs Updated the file manager upload page to allow user to upload multiple files without returning to the file manager page. Fixed issue with Event Log Email Notifications. Fixed issue where Telerik HTML Editor was unable to upload files to secure or database folder. Fixed issue where registration page is not set correctly during an upgrade. Fixed issue where Sendmail stripped HTML and Links from emails...mVu Mobile Viewer: mVu Mobile Viewer 0.7.10.0: Tube8 fix.EPPlus-Create advanced Excel 2007 spreadsheets on the server: EPPlus 2.8.0.1: EPPlus-Create advanced Excel 2007 spreadsheets on the serverNew Features Improved chart support Different chart-types series on the same chart Support for secondary axis and a lot of new properties Better styling Encryption and Workbook protection Table support Import csv files Array formulas ...and a lot of bugfixesNew Projects.NET 4 Workflow Activities for Citrix: .NET 4 based workflow activities targeting the Citrix infrastructure.Age calculator: It calculates the age of a person in days on specification of date of birth.Another Azure Demo Project: An Azure demo project - based on the one we (Johan Danforth and Dag König) showed on the Swedish Azure Summit.ASP.NET Layered Web Application: N-Layered Web Applications with ASP.NET based on the article by Imar Spaanjaars.Binzlog: Donet ????。Build Solution: Buid Visual Studio applications with .Net code.CondominioOnline: Projeto para o desenvolvimento colaborativo dos diagramas de desenvolvimento.Create Dynamic UI with WPF: Create Dynamic UI with WPFDNN Fanbox: dot net nuke plugin facebook fanboxDNN Tweet: DNN Tweet is a twitter plugin for DotnetNuke DotNetNuke Notes: dnnNotes allows you to create simple notes that are stored on your DotNetNuke site.Easy Login PHP Script: Give your site a professional looking Members Area with this completely FREE and easy-to-use PHP script! Developed in PHP and uses MySQL as a database backend. Go on, click here, you know you want to! :DFind Nigerian Traditional Fashion Styles: NaijaTradStyles is a social network for Nigerians all over the world to promote the Nigerian economy, designs and cultures, fashion designers and individuals. This site allows users to share fashion ideas, activities, events, and interests within their individual networks. The GreenArrow: Just a simple mark-locate-click automation tool by comparing graphic pieces. GreenArrow makes it easier for automation script writer to handle UI elements which cannot be located by normal methods, like keyword or classid. Libero API for Fusion Charts in ASP.Net: Libero.FusionChartsAPI is made for Asp.Net (Webforms and MVC) developers to make easier to implement Fusion Charts in their projects. It is developed in framework .Net 4 (but supports framework 3.5) to target ASP.Net projects. Minemapper - dynamic mapping for Windows: Minemapper is an interactive, dynamic mapper for Minecraft. It uses mcmap to generate small map image tiles, then lets you pan and zoom around, quickly generating new tiles as needed.MoodleAzure: Enable Moodle 1.9.9 to run on Windows Azure and SQL AzureOpalis Active Directory Extension: A Opalis Integration Pack Project for Active Directory Integration. Done with C# Directory Services.Quick Finger SDK: Quick Finger SDK helps you to build a wide range of applications to use fingerprint recognition. Quick Finger SDK makes it easier for developers to integrate fingerprint recognition into their software. It's developed in Visual C++. Regex Batch Replacer (Multi-File): Regex Batch Replacer uses regular expression to find and replace text in multiple files.RiverRaid X: A clone of the classic Atari 2600 arcade game, River Raid. Uses XNA 4.0 and Neat game engine (http://neat.codeplex.com)SharePoint Commander: SharePoint 2010 administrative tool for developers and administrators.StreamerMatch: A tool for streamers, focused at Starcraft II at the moment.Tab Web Part: This solution is used to present the WebParts in a tab like user interface. It is tested on a SharePoint 2010 sandboxed solution. With this solution, all the WebParts added in a particular zone will appear in a tab kind of interface in the design mode. The javascript transformsTomato: XNA-based rendering middleware.UnicornObjects: todoVina: VinaWPF Photo/Image Manager: A WPF playground for many projects, including an image viewer, filters, image modification, photo organization, etc.WXQCW: wxqcw news platformYobbo Guitar: Yobbo guitar is a web application developed in ASP.NET that allows users to share guitar songs and chord progressions.

Read the article
Fraud Detection with the SQL Server Suite Part 1

- by Dejan Sarka

While working on different fraud detection projects, I developed my own approach to the solution for this problem. In my PASS Summit 2013 session I am introducing this approach. I also wrote a whitepaper on the same topic, which was generously reviewed by my friend Matija Lah. In order to spread this knowledge faster, I am starting a series of blog posts which will at the end make the whole whitepaper. Abstract With the massive usage of credit cards and web applications for banking and payment processing, the number of fraudulent transactions is growing rapidly and on a global scale. Several fraud detection algorithms are available within a variety of different products. In this paper, we focus on using the Microsoft SQL Server suite for this purpose. In addition, we will explain our original approach to solving the problem by introducing a continuous learning procedure. Our preferred type of service is mentoring; it allows us to perform the work and consulting together with transferring the knowledge onto the customer, thus making it possible for a customer to continue to learn independently. This paper is based on practical experience with different projects covering online banking and credit card usage. Introduction A fraud is a criminal or deceptive activity with the intention of achieving financial or some other gain. Fraud can appear in multiple business areas. You can find a detailed overview of the business domains where fraud can take place in Sahin Y., & Duman E. (2011), Detecting Credit Card Fraud by Decision Trees and Support Vector Machines, Proceedings of the International MultiConference of Engineers and Computer Scientists 2011 Vol 1. Hong Kong: IMECS. Dealing with frauds includes fraud prevention and fraud detection. Fraud prevention is a proactive mechanism, which tries to disable frauds by using previous knowledge. Fraud detection is a reactive mechanism with the goal of detecting suspicious behavior when a fraudster surpasses the fraud prevention mechanism. A fraud detection mechanism checks every transaction and assigns a weight in terms of probability between 0 and 1 that represents a score for evaluating whether a transaction is fraudulent or not. A fraud detection mechanism cannot detect frauds with a probability of 100%; therefore, manual transaction checking must also be available. With fraud detection, this manual part can focus on the most suspicious transactions. This way, an unchanged number of supervisors can detect significantly more frauds than could be achieved with traditional methods of selecting which transactions to check, for example with random sampling. There are two principal data mining techniques available both in general data mining as well as in specific fraud detection techniques: supervised or directed and unsupervised or undirected. Supervised techniques or data mining models use previous knowledge. Typically, existing transactions are marked with a flag denoting whether a particular transaction is fraudulent or not. Customers at some point in time do report frauds, and the transactional system should be capable of accepting such a flag. Supervised data mining algorithms try to explain the value of this flag by using different input variables. When the patterns and rules that lead to frauds are learned through the model training process, they can be used for prediction of the fraud flag on new incoming transactions. Unsupervised techniques analyze data without prior knowledge, without the fraud flag; they try to find transactions which do not resemble other transactions, i.e. outliers. In both cases, there should be more frauds in the data set selected for checking by using the data mining knowledge compared to selecting the data set with simpler methods; this is known as the lift of a model. Typically, we compare the lift with random sampling. The supervised methods typically give a much better lift than the unsupervised ones. However, we must use the unsupervised ones when we do not have any previous knowledge. Furthermore, unsupervised methods are useful for controlling whether the supervised models are still efficient. Accuracy of the predictions drops over time. Patterns of credit card usage, for example, change over time. In addition, fraudsters continuously learn as well. Therefore, it is important to check the efficiency of the predictive models with the undirected ones. When the difference between the lift of the supervised models and the lift of the unsupervised models drops, it is time to refine the supervised models. However, the unsupervised models can become obsolete as well. It is also important to measure the overall efficiency of both, supervised and unsupervised models, over time. We can compare the number of predicted frauds with the total number of frauds that include predicted and reported occurrences. For measuring behavior across time, specific analytical databases called data warehouses (DW) and on-line analytical processing (OLAP) systems can be employed. By controlling the supervised models with unsupervised ones and by using an OLAP system or DW reports to control both, a continuous learning infrastructure can be established. There are many difficulties in developing a fraud detection system. As has already been mentioned, fraudsters continuously learn, and the patterns change. The exchange of experiences and ideas can be very limited due to privacy concerns. In addition, both data sets and results might be censored, as the companies generally do not want to publically expose actual fraudulent behaviors. Therefore it can be quite difficult if not impossible to cross-evaluate the models using data from different companies and different business areas. This fact stresses the importance of continuous learning even more. Finally, the number of frauds in the total number of transactions is small, typically much less than 1% of transactions is fraudulent. Some predictive data mining algorithms do not give good results when the target state is represented with a very low frequency. Data preparation techniques like oversampling and undersampling can help overcome the shortcomings of many algorithms. SQL Server suite includes all of the software required to create, deploy any maintain a fraud detection infrastructure. The Database Engine is the relational database management system (RDBMS), which supports all activity needed for data preparation and for data warehouses. SQL Server Analysis Services (SSAS) supports OLAP and data mining (in version 2012, you need to install SSAS in multidimensional and data mining mode; this was the only mode in previous versions of SSAS, while SSAS 2012 also supports the tabular mode, which does not include data mining). Additional products from the suite can be useful as well. SQL Server Integration Services (SSIS) is a tool for developing extract transform–load (ETL) applications. SSIS is typically used for loading a DW, and in addition, it can use SSAS data mining models for building intelligent data flows. SQL Server Reporting Services (SSRS) is useful for presenting the results in a variety of reports. Data Quality Services (DQS) mitigate the occasional data cleansing process by maintaining a knowledge base. Master Data Services is an application that helps companies maintaining a central, authoritative source of their master data, i.e. the most important data to any organization. For an overview of the SQL Server business intelligence (BI) part of the suite that includes Database Engine, SSAS and SSRS, please refer to Veerman E., Lachev T., & Sarka D. (2009). MCTS Self-Paced Training Kit (Exam 70-448): Microsoft® SQL Server® 2008 Business Intelligence Development and Maintenance. MS Press. For an overview of the enterprise information management (EIM) part that includes SSIS, DQS and MDS, please refer to Sarka D., Lah M., & Jerkic G. (2012). Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft® SQL Server® 2012. O'Reilly. For details about SSAS data mining, please refer to MacLennan J., Tang Z., & Crivat B. (2009). Data Mining with Microsoft SQL Server 2008. Wiley. SQL Server Data Mining Add-ins for Office, a free download for Office versions 2007, 2010 and 2013, bring the power of data mining to Excel, enabling advanced analytics in Excel. Together with PowerPivot for Excel, which is also freely downloadable and can be used in Excel 2010, is already included in Excel 2013. It brings OLAP functionalities directly into Excel, making it possible for an advanced analyst to build a complete learning infrastructure using a familiar tool. This way, many more people, including employees in subsidiaries, can contribute to the learning process by examining local transactions and quickly identifying new patterns.

Read the article
Android: How/where to put gesture code into IME?

- by CardinalFIB

Hi, I'm new to Android but I'm trying to create an IME that allows for gesture-character recognition. I can already do simple apps that perform gesture recognition but am not sure where to hook in the gesture views/obj with an IME. Here is a starting skeleton of what I have for the IME so far. I would like to use android.gesture.Gesture/Prediction/GestureOverlayView/OnGesturePerformedListener. Does anyone have advice? -- CardinalFIB gestureIME.java public class gestureIME extends InputMethodService { private static Keyboard keyboard; private static KeyboardView kView; private int lastDisplayWidth; @Override public void onCreate() { super.onCreate(); } @Override public void onInitializeInterface() { int displayWidth; if (keyboard != null) { displayWidth = getMaxWidth(); if (displayWidth == lastDisplayWidth) return; else lastDisplayWidth = getMaxWidth(); } keyboard = new GestureKeyboard(this, R.xml.keyboard); } @Override public View onCreateInputView() { kView = (KeyboardView) getLayoutInflater().inflate(R.layout.input, null); kView.setOnKeyboardActionListener(kListener); kView.setKeyboard(keyboard); return kView; } @Override public View onCreateCandidatesView() { return null; } @Override public void onStartInputView(EditorInfo attribute, boolean restarting) { super.onStartInputView(attribute, restarting); kView.setKeyboard(keyboard); kView.closing(); //what does this do??? } @Override public void onStartInput(EditorInfo attribute, boolean restarting) { super.onStartInput(attribute, restarting); } @Override public void onFinishInput() { super.onFinishInput(); } public KeyboardView.OnKeyboardActionListener kListener = new KeyboardView.OnKeyboardActionListener() { @Override public void onKey(int keyCode, int[] otherKeyCodes) { if(keyCode==Keyboard.KEYCODE_CANCEL) handleClose(); if(keyCode==10) getCurrentInputConnection().commitText(String.valueOf((char) keyCode), 1); //keyCode RETURN } @Override public void onPress(int primaryCode) {} // TODO Auto-generated method stub @Override public void onRelease(int primaryCode) {} // TODO Auto-generated method stub @Override public void onText(CharSequence text) {} // TODO Auto-generated method stub @Override public void swipeDown() {} // TODO Auto-generated method stub @Override public void swipeLeft() {} // TODO Auto-generated method stub @Override public void swipeRight() {} // TODO Auto-generated method stub @Override public void swipeUp() {} // TODO Auto-generated method stub }; private void handleClose() { requestHideSelf(0); kView.closing(); } } GestureKeyboard.java package com.android.jt.gestureIME; import android.content.Context; import android.inputmethodservice.Keyboard; public class GestureKeyboard extends Keyboard { public GestureKeyboard(Context context, int xmlLayoutResId) { super(context, xmlLayoutResId); } } GesureKeyboardView.java package com.android.jt.gestureIME; import android.content.Context; import android.inputmethodservice.KeyboardView; import android.inputmethodservice.Keyboard.Key; import android.util.AttributeSet; public class GestureKeyboardView extends KeyboardView { public GestureKeyboardView(Context context, AttributeSet attrs) { super(context, attrs); } public GestureKeyboardView(Context context, AttributeSet attrs, int defStyle) { super(context, attrs, defStyle); } @Override protected boolean onLongPress(Key key) { return super.onLongPress(key); } } keyboard.xml <?xml version="1.0" encoding="utf-8"?> <Keyboard xmlns:android="http://schemas.android.com/apk/res/android" android:keyWidth="10%p" android:horizontalGap="0px" android:verticalGap="0px" android:keyHeight="@dimen/key_height" > <Row android:rowEdgeFlags="bottom"> <Key android:codes="-3" android:keyLabel="Close" android:keyWidth="20%p" android:keyEdgeFlags="left"/> <Key android:codes="10" android:keyLabel="Return" android:keyWidth="20%p" android:keyEdgeFlags="right"/> </Row> </Keyboard> input.xml <?xml version="1.0" encoding="utf-8"?> <com.android.jt.gestureIME.GestureKeyboardView xmlns:android="http://schemas.android.com/apk/res/android" android:id="@+id/gkeyboard" android:layout_alignParentBottom="true" android:layout_width="fill_parent" android:layout_height="wrap_content" />

Read the article
Languages and VMs: Features that are hard to optimize and why

- by mrjoltcola

I'm doing a survey of features in preparation for a research project. Name a mainstream language or language feature that is hard to optimize, and why the feature is or isn't worth the price paid, or instead, just debunk my theories below with anecdotal evidence. Before anyone flags this as subjective, I am asking for specific examples of languages or features, and ideas for optimization of these features, or important features that I haven't considered. Also, any references to implementations that prove my theories right or wrong. Top on my list of hard to optimize features and my theories (some of my theories are untested and are based on thought experiments): 1) Runtime method overloading (aka multi-method dispatch or signature based dispatch). Is it hard to optimize when combined with features that allow runtime recompilation or method addition. Or is it just hard, anyway? Call site caching is a common optimization for many runtime systems, but multi-methods add additional complexity as well as making it less practical to inline methods. 2) Type morphing / variants (aka value based typing as opposed to variable based) Traditional optimizations simply cannot be applied when you don't know if the type of someting can change in a basic block. Combined with multi-methods, inlining must be done carefully if at all, and probably only for a given threshold of size of the callee. ie. it is easy to consider inlining simple property fetches (getters / setters) but inlining complex methods may result in code bloat. The other issue is I cannot just assign a variant to a register and JIT it to the native instructions because I have to carry around the type info, or every variable needs 2 registers instead of 1. On IA-32 this is inconvenient, even if improved with x64's extra registers. This is probably my favorite feature of dynamic languages, as it simplifies so many things from the programmer's perspective. 3) First class continuations - There are multiple ways to implement them, and I have done so in both of the most common approaches, one being stack copying and the other as implementing the runtime to use continuation passing style, cactus stacks, copy-on-write stack frames, and garbage collection. First class continuations have resource management issues, ie. we must save everything, in case the continuation is resumed, and I'm not aware if any languages support leaving a continuation with "intent" (ie. "I am not coming back here, so you may discard this copy of the world"). Having programmed in the threading model and the contination model, I know both can accomplish the same thing, but continuations' elegance imposes considerable complexity on the runtime and also may affect cache efficienty (locality of stack changes more with use of continuations and co-routines). The other issue is they just don't map to hardware. Optimizing continuations is optimizing for the less-common case, and as we know, the common case should be fast, and the less-common cases should be correct. 4) Pointer arithmetic and ability to mask pointers (storing in integers, etc.) Had to throw this in, but I could actually live without this quite easily. My feelings are that many of the high-level features, particularly in dynamic languages just don't map to hardware. Microprocessor implementations have billions of dollars of research behind the optimizations on the chip, yet the choice of language feature(s) may marginalize many of these features (features like caching, aliasing top of stack to register, instruction parallelism, return address buffers, loop buffers and branch prediction). Macro-applications of micro-features don't necessarily pan out like some developers like to think, and implementing many languages in a VM ends up mapping native ops into function calls (ie. the more dynamic a language is the more we must lookup/cache at runtime, nothing can be assumed, so our instruction mix is made up of a higher percentage of non-local branching than traditional, statically compiled code) and the only thing we can really JIT well is expression evaluation of non-dynamic types and operations on constant or immediate types. It is my gut feeling that bytecode virtual machines and JIT cores are perhaps not always justified for certain languages because of this. I welcome your answers.

Read the article
Generating strongly biased radom numbers for tests

- by nobody

I want to run tests with randomized inputs and need to generate 'sensible' random numbers, that is, numbers that match good enough to pass the tested function's preconditions, but hopefully wreak havoc deeper inside its code. math.random() (I'm using Lua) produces uniformly distributed random numbers. Scaling these up will give far more big numbers than small numbers, and there will be very few integers. I would like to skew the random numbers (or generate new ones using the old function as a randomness source) in a way that strongly favors 'simple' numbers, but will still cover the whole range, I.e. extending up to positive/negative infinity (or ±1e309 for double). This means: numbers up to, say, ten should be most common, integers should be more common than fractions, numbers ending in 0.5 should be the most common fractions, followed by 0.25 and 0.75; then 0.125, and so on. A different description: Fix a base probability x such that probabilities will sum to one and define the probability of a number n as xk where k is the generation in which n is constructed as a surreal number1. That assigns x to 0, x2 to -1 and +1, x3 to -2, -1/2, +1/2 and +2, and so on. This gives a nice description of something close to what I want (it skews a bit too much), but is near-unusable for computing random numbers. The resulting distribution is nowhere continuous (it's fractal!), I'm not sure how to determine the base probability x (I think for infinite precision it would be zero), and computing numbers based on this by iteration is awfully slow (spending near-infinite time to construct large numbers). Does anyone know of a simple approximation that, given a uniformly distributed randomness source, produces random numbers very roughly distributed as described above? I would like to run thousands of randomized tests, quantity/speed is more important than quality. Still, better numbers mean less inputs get rejected. Lua has a JIT, so performance can't be reasonably predicted. Jumps based on randomness will break every prediction, and many calls to math.random() will be slow, too. This means a closed formula will be better than an iterative or recursive one. 1 Wikipedia has an article on surreal numbers, with a nice picture. A surreal number is a pair of two surreal numbers, i.e. x := {n|m}, and its value is the number in the middle of the pair, i.e. (for finite numbers) {n|m} = (n+m)/2 (as rational). If one side of the pair is empty, that's interpreted as increment (or decrement, if right is empty) by one. If both sides are empty, that's zero. Initially, there are no numbers, so the only number one can build is 0 := { | }. In generation two one can build numbers {0| } =: 1 and { |0} =: -1, in three we get {1| } =: 2, {|1} =: -2, {0|1} =: 1/2 and {-1|0} =: -1/2 (plus some more complex representations of known numbers, e.g. {-1|1} ? 0). Note that e.g. 1/3 is never generated by finite numbers because it is an infinite fraction – the same goes for floats, 1/3 is never represented exactly.

Read the article
Using an alternate search platform in Commerce Server 2009

- by Lewis Benge

Although Microsoft Commerce Server 2009's architecture is built upon Microsoft SQL Server, and has the full power of the SQL Full Text Indexing Search Platform, there are time however when you may require a richer or alternate search platform. One of these scenarios if when you want to implement a faceted (refinement) search into your site, which provides dynamic refinements based on the search results dataset. Faceted search is becoming popular in most online retail environments as a way of providing an enhanced user experience when browsing a larger catalogue. This is powerful for two reasons, firstly with a traditional search it is down to a user to think of a search term suitable for the product they are trying to find. This typically will not return similar products or help in any way to refine a larger dataset. Faceted searches on the other hand provide a comprehensive list of product properties, grouped together by similarity to help the user narrow down the results returned, as the user progressively restricts the search criteria by selecting additional criteria to search again, these facets needs to continually refresh. The whole experience allows users to explore alternate brands, price-ranges, or find products they hadn't initially thought of or where looking for in a bid to enhance cross sell in the retail environment. The second advantage of this type of search from a business perspective is also to harvest the search result to start to profile your user. Even though anonymous users may routinely visit your site, and will not necessarily register or complete a transaction to build up marketing data- profiling, you can still achieve the same result by recording search facets used within the search sequence. Below is a faceted search scenario generated from eBay using the search term "server". By creating a search profile of clicking through Computer & Networking -> Servers -> Dell - > New and recording this information against my user profile you can start to predict with a lot more certainty what types of products I am interested in. This will allow you to apply shopping-cart analysis against your search data and provide great cross-sale or advertising opportunity, or personalise the user experience based on your prediction of what the user may be interested in. This type of search is extremely beneficial in e-Commerce environments but achieving it out of the box with Commerce Server and SQL Full Text indexing can be challenging. In many deployments it is often easier to use an alternate search platform such as Microsoft's FAST, Apache SOLR, or Endecca, however you still want these products to integrate natively into Commerce Server to ensure that up-to-date inventory information is presented, profile information is generated, and you provide a consistant API. To do so we make the most of the Commerce Server extensibilty points called operation sequence components. In this example I will be talking about Apache Solr hosted on Apache Tomcat, in this specific example I have used the SolrNet C# library to interface to the Java platform. Also I am not going to talk about Solr configuration of indexing – but in a production envionrment this would typically happen by using Powershell to call the Commerce Server management webservice to export your catalog as XML, apply an XSLT transform to the file to make it conform to SOLR and use a simple HTTP Post to send it to the search enginge for indexing. Essentially a sequance component is a step in a serial workflow used to call a data repository (which in most cases is usually the Commerce Server pipelines or databases) and map to and from a Commerce Entity object whilst enforcing any business rules. So the first step in the process is to add a new class library to your existing Commerce Server site. You will need to use a new library as Sequence Components will need to be strongly named to be deployed. Once you are inside of your new project, add a new class file and add a reference to the Microsoft.Commerce.Providers, Microsoft.Commerce.Contracts and the Microsoft.Commerce.Broker assemblies. Now make your new class derive from the base object Microsoft.Commerce.Providers.Components.OperationSequanceComponent and overide the ExecuteQueryMethod. Your screen will then look something similar ot this: As all we are doing on this component is conducting a search we are only interested in the ExecuteQuery method. This method accepts three arguments, queryOperation, operationCache, and response. The queryOperation will be the object in which we receive our search parameters, the cache allows access to the Commerce Server cache allowing us to store regulary accessed information, and the response object is the object which we will return the result of our search upon. Inside this method is simply where we are going to inject our logic for our third party search platform. As I am not going to explain the inner-workings of actually making a SOLR call, I'll simply provide the sample code here. I would highly recommend however looking at the SolrNet wiki as they have some great explinations of how the API works. What you will find however is that there are some further extensions required when attempting to integrate a custom search provider. Firstly you out of the box the CommerceQueryOperation you will receive into the method when conducting a search against a catalog is specifically geared towards a SQL Full Text Search with properties such as a Where clause. To make the operation you receive more relevant you will need to create another class, this time derived from Microsoft.Commerce.Contract.Messages.CommerceSearchCriteria and within this you need to detail the properties you will require to allow you to submit as parameters to the SOLR search API. My exmaple looks like this: [DataContract(Namespace = "http://schemas.microsoft.com/microsoft-multi-channel-commerce-foundation/types/2008/03")] public class CommerceCatalogSolrSearch : CommerceSearchCriteria { private Dictionary<string, string> _facetQueries; public CommerceCatalogSolrSearch() { _facetQueries = new Dictionary<String, String>(); } public Dictionary<String, String> FacetQueries { get { return _facetQueries; } set { _facetQueries = value; } } public String SearchPhrase{ get; set; } public int PageIndex { get; set; } public int PageSize { get; set; } public IEnumerable<String> Facets { get; set; } public string Sort { get; set; } public new int FirstItemIndex { get { return (PageIndex-1)*PageSize; } } public int LastItemIndex { get { return FirstItemIndex + PageSize; } } } To allow you to construct a CommerceQueryOperation call within the API you will also need to construct another class to derived from Microsoft.Commerce.Common.MessageBuilders.CommerceSearchCriteriaBuilder and is simply used to construct an instance of the CommerceQueryOperation you have just created and expose the properties you want set. My Message builder looks like this: public class CommerceCatalogSolrSearchBuilder : CommerceSearchCriteriaBuilder { private CommerceCatalogSolrSearch _solrSearch; public CommerceCatalogSolrSearchBuilder() { _solrSearch = new CommerceCatalogSolrSearch(); } public String SearchPhrase { get { return _solrSearch.SearchPhrase; } set { _solrSearch.SearchPhrase = value; } } public int PageIndex { get { return _solrSearch.PageIndex; } set { _solrSearch.PageIndex = value; } } public int PageSize { get { return _solrSearch.PageSize; } set { _solrSearch.PageSize = value; } } public Dictionary<String,String> FacetQueries { get { return _solrSearch.FacetQueries; } set { _solrSearch.FacetQueries = value; } } public String[] Facets { get { return _solrSearch.Facets.ToArray(); } set { _solrSearch.Facets = value; } } public override CommerceSearchCriteria ToSearchCriteria() { return _solrSearch; } } Once you have these two classes in place you can now safely cast the CommerceOperation you receive as an argument of the overidden ExecuteQuery method in the SequenceComponent to the CommerceCatalogSolrSearch operation you have just created, e.g. public CommerceCatalogSolrSearch TryGetSearchCriteria(CommerceOperation operation) { var searchCriteria = operation as CommerceQueryOperation; if (searchCriteria == null) throw new Exception("No search criteria present"); var local = (CommerceCatalogSolrSearch) searchCriteria.SearchCriteria; if (local == null) throw new Exception("Unexpected Search Criteria in Operation"); return local; } Now you have all of your search parameters present, you can go off an call the external search platform API. You will of-course get proprietry objects returned, so the next step in the process is to convert the results being returned back into CommerceEntities. You do this via another extensibility point within the Commerce Server API called translatators. Translators are another separate class, this time derived inheriting the interface Microsoft.Commerce.Providers.Translators.IToCommerceEntityTranslator . As you can imaginge this interface is specific for the conversion of the object TO a CommerceEntity, you will need to implement a separate interface if you also need to go in the opposite direction. If you implement the required method for the interace you will get a single translate method which has a source onkect, destination CommerceEntity, and a collection of properties as arguments. For simplicity sake in this example I have hard-coded the mappings, however best practice would dictate you map the objects using your metadatadefintions.xml file . Once complete your translator would look something like the following: public class SolrEntityTranslator : IToCommerceEntityTranslator { #region IToCommerceEntityTranslator Members public void Translate(object source, CommerceEntity destinationCommerceEntity, CommercePropertyCollection propertiesToReturn) { if (source.GetType().Equals(typeof (SearchProduct))) { var searchResult = (SearchProduct) source; destinationCommerceEntity.Id = searchResult.ProductId; destinationCommerceEntity.SetPropertyValue("DisplayName", searchResult.Title); destinationCommerceEntity.ModelName = "Product"; } } Once you have a translator in place you can then safely map the results of your search platform into Commerce Entities and attach them on to the CommerceResponse object in a fashion similar to this: foreach (SearchProduct result in matchingProducts) { var destinationEntity = new CommerceEntity(_returnModelName); Translator.ToCommerceEntity(result, destinationEntity, _queryOperation.Model.Properties); response.CommerceEntities.Add(destinationEntity); } In SOLR I actually have two objects being returned – a product, and a collection of facets so I have an additional translator for facet (which maps to a custom facet CommerceEntity) and my facet response from SOLR is passed into the Translator helper class seperatley. When all of this is pieced together you have sucessfully completed the extensiblity point coding. You would have created a new OperationSequanceComponent, a custom SearchCritiera object and message builder class, and translators to convert the objects into Commerce Entities. Now you simply need to configure them, and can start calling them in your code. Make sure you sign you assembly, compile it and identiy its signature. Next you need to put this a reference of your new assembly into the Channel.Config configuration file replacing that of the existing SQL Full Text component: You will also need to add your translators to the Translators node of your Channel.Config too: Lastly add any custom CommerceEntities you have developed to your MetaDataDefintions.xml file. Your configuration is now complete, and you should now be able to happily make a call to the Commerce Foundation API, which will act as a proxy to your third party search platform and return back CommerceEntities of your search results. If you require data to be enriched, or logged, or any other logic applied then simply add further sequence components into the OperationSequence (obviously keeping the search response first) to the node of your Channel.Config file. Now to call your code you simply request it as per any other CommerceQuery operation, but taking into account you may be receiving multiple types of CommerceEntity returned: public KeyValuePair<FacetCollection ,List<Product>> DoFacetedProductQuerySearch(string searchPhrase, string orderKey, string sortOrder, int recordIndex, int recordsPerPage, Dictionary<string, string> facetQueries, out int totalItemCount) { var products = new List<Product>(); var query = new CommerceQuery<CatalogEntity, CommerceCatalogSolrSearchBuilder>(); query.SearchCriteria.PageIndex = recordIndex; query.SearchCriteria.PageSize = recordsPerPage; query.SearchCriteria.SearchPhrase = searchPhrase; query.SearchCriteria.FacetQueries = facetQueries; totalItemCount = 0; CommerceResponse response = SiteContext.ProcessRequest(query.ToRequest()); var queryResponse = response.OperationResponses[0] as CommerceQueryOperationResponse; // No results. Return the empty list if (queryResponse != null && queryResponse.CommerceEntities.Count == 0) return new KeyValuePair<FacetCollection, List<Product>>(); totalItemCount = (int)queryResponse.TotalItemCount; // Prepare a multi-operation to retrieve the product variants var multiOperation = new CommerceMultiOperation(); //Add products to results foreach (Product product in queryResponse.CommerceEntities.Where(x => x.ModelName == "Product")) { var productQuery = new CommerceQuery<Product>(Product.ModelNameDefinition); productQuery.SearchCriteria.Model.Id = product.Id; productQuery.SearchCriteria.Model.CatalogId = product.CatalogId; var variantQuery = new CommerceQueryRelatedItem<Variant>(Product.RelationshipName.Variants); productQuery.RelatedOperations.Add(variantQuery); multiOperation.Add(productQuery); } CommerceResponse variantsResponse = SiteContext.ProcessRequest(multiOperation.ToRequest()); foreach (CommerceQueryOperationResponse queryOpResponse in variantsResponse.OperationResponses) { if (queryOpResponse.CommerceEntities.Count() > 0) products.Add(queryOpResponse.CommerceEntities[0]); } //Get facet collection FacetCollection facetCollection = queryResponse.CommerceEntities.Where(x => x.ModelName == "FacetCollection").FirstOrDefault(); return new KeyValuePair<FacetCollection, List<Product>>(facetCollection, products); } ..And that is it – simply a few classes and some configuration will allow you to extend the Commerce Server query operations to call a third party search platform, whilst still maintaing a unifed API in the remainder of your code. This logic stands for any extensibility within CommerceServer, which requires excution in a serial fashioon such as call to LOB systems or web service to validate or enrich data. Feel free to use this example on other applications, and if you have any questions please feel free to e-mail and I'll help out where I can!

Read the article

Search Results

Search found 134 results on 6 pages for 'prediction'.

Page 5/6 | < Previous Page | 1 2 3 4 5 6 | Next Page >

- by Bob Rhubart

- by John Murdoch

- by compound eye

- by GGBlogger

- by Pinal Dave

- by Tony Berk

- by flies

- by user1521587

- by eric

- by Pinal Dave

- by Nick Harrison

- by alanc

- by NEETHU

- by danpalmer

- by JoshReuben

- by jean-pierre.dijcks

- by [email protected]

- by JoshReuben

- by Randy Walker

- by Dejan Sarka

- by CardinalFIB

- by mrjoltcola

- by nobody

- by Lewis Benge

< Previous Page | 1 2 3 4 5 6 | Next Page >