Search Results

Search found 13070 results on 523 pages for 'simply tom'.

Page 492/523 | < Previous Page | 488 489 490 491 492 493 494 495 496 497 498 499  | Next Page >

  • Rockmelt, the technology adoption model, and Facebook's spare internet

    - by Roger Hart
    Regardless of how good it is, you'd have to have a heart of stone not to make snide remarks about Rockmelt. After all, on the surface it looks a lot like some people spent two years building a browser instead of just bashing out a Chrome extension over a wet weekend. It probably does some more stuff. I don't know for sure because artificial scarcity is cool, apparently, so the "invitation" is still in the post*. I may in fact never know for sure, because I'm not wild about Facebook sign-in as a prerequisite for anything. From the video, and some initial reviews, my early reaction was: I have a browser, I have a Twitter client; what on earth is this for? The answer, of course, is "not me". Rockmelt is, in a way, quite audacious. Oh, sure, on launch day it's Bay Area bar-chat for the kids with no lenses in their retro specs and trousers that give you deep-vein thrombosis, but it's not really about them. Likewise,  Facebook just launched Google Wave, or something. And all the tech snobbery and scorn packed into describing it that way is irrelevant next to what they're doing with their platform. Here's something I drew in MS Paint** because I don't want to get sued: (see: The technology adoption lifecycle) A while ago in the Guardian, John Lanchester dusted off the idiom that "technology is stuff that doesn't work yet". The rest of the article would be quite interesting if it wasn't largely about MySpace, and he's sort of got a point. If you bolt on the sentiment that risk-averse businessmen like things that work, you've got the essence of Crossing the Chasm. Products for the mainstream market don't look much like technology. Think for  a second about early (1980s ish) hi-fi systems, with all the knobs and fiddly bits, their ostentatious technophile aesthetic. Then consider their sleeker and less (or at least less conspicuously) functional successors in the 1990s. The theory goes that innovators and early adopters like technology, it's a hobby in itself. The rest of the humans seem to like magic boxes with very few buttons that make stuff happen and never trouble them about why. Personally, I consider Apple's maddening insistence that iTunes is an acceptable way to move files around to be more or less morally unacceptable. Most people couldn't care less. Hence Rockmelt, and hence Facebook's continued growth. Rockmelt looks pointless to me, because I aggregate my social gubbins with Digsby, or use TweetDeck. But my use case is different and so are my enthusiasms. If I want to share photos, I'll use Flickr - but Facebook has photo sharing. If I want a short broadcast message, I'll use Twitter - Facebook has status updates. If I want to sell something with relatively little hassle, there's eBay - or Facebook marketplace. YouTube - check, FB Video. Email - messaging. Calendaring apps, yeah there are loads, or FB Events. What if I want to host a simple web page? Sure, they've got pages. Also Notes for blogging, and more games than I can count. This stuff is right there, where millions and millions of users are already, and for what they need it just works. It's not about me, because I'm not in the big juicy area under the curve. It's what 1990s portal sites could never have dreamed of achieving. Facebook is AOL on speed, crack, and some designer drugs it had specially imported from the future. It's a n00b-friendly gateway to the internet that just happens to serve up all the things you want to do online, right where you are. Oh, and everybody else is there too. The price of having all this and the social graph too is that you have all of this, and the social graph too. But plenty of folks have more incisive things to say than me about the whole privacy shebang, and it's not really what I'm talking about. Facebook is maintaining a vast, and fairly fully-featured training-wheels internet. And it makes up a large proportion of the online experience for a lot of people***. It's the entire web (2.0?) experience for the early and late majority. And sure, no individual bit of it is quite as slick or as fully-realised as something like Flickr (which wows me a bit every time I use it. Those guys are good at the web), but it doesn't have to be. It has to be unobtrusively good enough for the regular humans. It has to not feel like technology. This is what Rockmelt sort of is. You're online, you want something nebulously social, and you don't want to faff about with, say, Twitter clients. Wow! There it is on a really distracting sidebar, right in your browser. No effort! Yeah - fish nor fowl, much? It might work, I guess. There may be a demographic who want their social web experience more simply than tech tinkering, and who aren't just getting it from Facebook (or, for that matter, mobile devices). But I'd be surprised. Rockmelt feels like an attempt to grab a slice of Facebook-style "Look! It's right here, where you already are!", but it's still asking the mature market to install a new browser. Presumably this is where that Facebook sign-in predicate comes in handy, though it'll take some potent awareness marketing to make it fly. Meanwhile, Facebook quietly has the entire rest of the internet as a product management resource, and can continue to give most of the people most of what they want. Something that has not gone un-noticed in its potential to look a little sinister. But heck, they might even make Google Wave popular.     *This was true last week when I drafted this post. I got an invite subsequently, hence the screenshot. **MS Paint is no fun any more. It's actually good in Windows 7. Farewell ironically-shonky diagrams. *** It's also behind a single sign-in, lending a veneer of confidence, and partially solving the problem of usernames being crummy unique identifiers. I'll be blogging about that at some point.

    Read the article

  • Use Autoruns to Manually Clean an Infected PC

    - by Mark Virtue
    There are many anti-malware programs out there that will clean your system of nasties, but what happens if you’re not able to use such a program?  Autoruns, from SysInternals (recently acquired by Microsoft), is indispensable when removing malware manually. There are a few reasons why you may need to remove viruses and spyware manually: Perhaps you can’t abide running resource-hungry and invasive anti-malware programs on your PC You might need to clean your mom’s computer (or someone else who doesn’t understand that a big flashing sign on a website that says “Your computer is infected with a virus – click HERE to remove it” is not a message that can necessarily be trusted) The malware is so aggressive that it resists all attempts to automatically remove it, or won’t even allow you to install anti-malware software Part of your geek credo is the belief that anti-spyware utilities are for wimps Autoruns is an invaluable addition to any geek’s software toolkit.  It allows you to track and control all programs (and program components) that start automatically with Windows (or with Internet Explorer).  Virtually all malware is designed to start automatically, so there’s a very strong chance that it can be detected and removed with the help of Autoruns. We have covered how to use Autoruns in an earlier article, which you should read if you need to first familiarize yourself with the program. Autoruns is a standalone utility that does not need to be installed on your computer.  It can be simply downloaded, unzipped and run (link below).  This makes is ideally suited for adding to your portable utility collection on your flash drive. When you start Autoruns for the first time on a computer, you are presented with the license agreement: After agreeing to the terms, the main Autoruns window opens, showing you the complete list of all software that will run when your computer starts, when you log in, or when you open Internet Explorer: To temporarily disable a program from launching, uncheck the box next to it’s entry.  Note:  This does not terminate the program if it is running at the time – it merely prevents it from starting next time.  To permanently prevent a program from launching, delete the entry altogether (use the Delete key, or right-click and choose Delete from the context-menu)).  Note:  This does not remove the program from your computer – to remove it completely you need to uninstall the program (or otherwise delete it from your hard disk). Suspicious Software It can take a fair bit of experience (read “trial and error”) to become adept at identifying what is malware and what is not.  Most of the entries presented in Autoruns are legitimate programs, even if their names are unfamiliar to you.  Here are some tips to help you differentiate the malware from the legitimate software: If an entry is digitally signed by a software publisher (i.e. there’s an entry in the Publisher column) or has a “Description”, then there’s a good chance that it’s legitimate If you recognize the software’s name, then it’s usually okay.  Note that occasionally malware will “impersonate” legitimate software, but adopting a name that’s identical or similar to software you’re familiar with (e.g. “AcrobatLauncher” or “PhotoshopBrowser”).  Also, be aware that many malware programs adopt generic or innocuous-sounding names, such as “Diskfix” or “SearchHelper” (both mentioned below). Malware entries usually appear on the Logon tab of Autoruns (but not always!) If you open up the folder that contains the EXE or DLL file (more on this below), an examine the “last modified” date, the dates are often from the last few days (assuming that your infection is fairly recent) Malware is often located in the C:\Windows folder or the C:\Windows\System32 folder Malware often only has a generic icon (to the left of the name of the entry) If in doubt, right-click the entry and select Search Online… The list below shows two suspicious looking entries:  Diskfix and SearchHelper These entries, highlighted above, are fairly typical of malware infections: They have neither descriptions nor publishers They have generic names The files are located in C:\Windows\System32 They have generic icons The filenames are random strings of characters If you look in the C:\Windows\System32 folder and locate the files, you’ll see that they are some of the most recently modified files in the folder (see below) Double-clicking on the items will take you to their corresponding registry keys: Removing the Malware Once you’ve identified the entries you believe to be suspicious, you now need to decide what you want to do with them.  Your choices include: Temporarily disable the Autorun entry Permanently delete the Autorun entry Locate the running process (using Task Manager or similar) and terminating it Delete the EXE or DLL file from your disk (or at least move it to a folder where it won’t be automatically started) or all of the above, depending upon how certain you are that the program is malware. To see if your changes succeeded, you will need to reboot your machine, and check any or all of the following: Autoruns – to see if the entry has returned Task Manager (or similar) – to see if the program was started again after the reboot Check the behavior that led you to believe that your PC was infected in the first place.  If it’s no longer happening, chances are that your PC is now clean Conclusion This solution isn’t for everyone and is most likely geared to advanced users. Usually using a quality Antivirus application does the trick, but if not Autoruns is a valuable tool in your Anti-Malware kit. Keep in mind that some malware is harder to remove than others.  Sometimes you need several iterations of the steps above, with each iteration requiring you to look more carefully at each Autorun entry.  Sometimes the instant that you remove the Autorun entry, the malware that is running replaces the entry.  When this happens, we need to become more aggressive in our assassination of the malware, including terminating programs (even legitimate programs like Explorer.exe) that are infected with malware DLLs. Shortly we will be publishing an article on how to identify, locate and terminate processes that represent legitimate programs but are running infected DLLs, in order that those DLLs can be deleted from the system. Download Autoruns from SysInternals Similar Articles Productive Geek Tips Using Autoruns Tool to Track Startup Applications and Add-onsHow To Get Detailed Information About Your PCSUPERAntiSpyware Portable is the Must-Have Spyware Removal Tool You NeedQuick Tip: Windows Vista Temp Files DirectoryClear Recent Commands From the Run Dialog in Windows XP TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional 15 Great Illustrations by Chow Hon Lam Easily Sync Files & Folders with Friends & Family Amazon Free Kindle for PC Download Stretch popurls.com with a Stylish Script (Firefox) OldTvShows.org – Find episodes of Hitchcock, Soaps, Game Shows and more Download Microsoft Office Help tab

    Read the article

  • C# Extension Methods - To Extend or Not To Extend...

    - by James Michael Hare
    I've been thinking a lot about extension methods lately, and I must admit I both love them and hate them. They are a lot like sugar, they taste so nice and sweet, but they'll rot your teeth if you eat them too much.   I can't deny that they aren't useful and very handy. One of the major components of the Shared Component library where I work is a set of useful extension methods. But, I also can't deny that they tend to be overused and abused to willy-nilly extend every living type.   So what constitutes a good extension method? Obviously, you can write an extension method for nearly anything whether it is a good idea or not. Many times, in fact, an idea seems like a good extension method but in retrospect really doesn't fit.   So what's the litmus test? To me, an extension method should be like in the movies when a person runs into their twin, separated at birth. You just know you're related. Obviously, that's hard to quantify, so let's try to put a few rules-of-thumb around them.   A good extension method should:     Apply to any possible instance of the type it extends.     Simplify logic and improve readability/maintainability.     Apply to the most specific type or interface applicable.     Be isolated in a namespace so that it does not pollute IntelliSense.     So let's look at a few examples in relation to these rules.   The first rule, to me, is the most important of all. Once again, it bears repeating, a good extension method should apply to all possible instances of the type it extends. It should feel like the long lost relative that should have been included in the original class but somehow was missing from the family tree.    Take this nifty little int extension, I saw this once in a blog and at first I really thought it was pretty cool, but then I started noticing a code smell I couldn't quite put my finger on. So let's look:       public static class IntExtensinos     {         public static int Seconds(int num)         {             return num * 1000;         }           public static int Minutes(int num)         {             return num * 60000;         }     }     This is so you could do things like:       ...     Thread.Sleep(5.Seconds());     ...     proxy.Timeout = 1.Minutes();     ...     Awww, you say, that's cute! Well, that's the problem, it's kitschy and it doesn't always apply (and incidentally you could achieve the same thing with TimeStamp.FromSeconds(5)). It's syntactical candy that looks cool, but tends to rot and pollute the code. It would allow things like:       total += numberOfTodaysOrders.Seconds();     which makes no sense and should never be allowed. The problem is you're applying an extension method to a logical domain, not a type domain. That is, the extension method Seconds() doesn't really apply to ALL ints, it applies to ints that are representative of time that you want to convert to milliseconds.    Do you see what I mean? The two problems, in a nutshell, are that a) Seconds() called off a non-time value makes no sense and b) calling Seconds() off something to pass to something that does not take milliseconds will be off by a factor of 1000 or worse.   Thus, in my mind, you should only ever have an extension method that applies to the whole domain of that type.   For example, this is one of my personal favorites:       public static bool IsBetween<T>(this T value, T low, T high)         where T : IComparable<T>     {         return value.CompareTo(low) >= 0 && value.CompareTo(high) <= 0;     }   This allows you to check if any IComparable<T> is within an upper and lower bound. Think of how many times you type something like:       if (response.Employee.Address.YearsAt >= 2         && response.Employee.Address.YearsAt <= 10)     {     ...     }     Now, you can instead type:       if(response.Employee.Address.YearsAt.IsBetween(2, 10))     {     ...     }     Note that this applies to all IComparable<T> -- that's ints, chars, strings, DateTime, etc -- and does not depend on any logical domain. In addition, it satisfies the second point and actually makes the code more readable and maintainable.   Let's look at the third point. In it we said that an extension method should fit the most specific interface or type possible. Now, I'm not saying if you have something that applies to enumerables, you create an extension for List, Array, Dictionary, etc (though you may have reasons for doing so), but that you should beware of making things TOO general.   For example, let's say we had an extension method like this:       public static T ConvertTo<T>(this object value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This lets you do more fluent conversions like:       double d = "5.0".ConvertTo<double>();     However, if you dig into Reflector (LOVE that tool) you will see that if the type you are calling on does not implement IConvertible, what you convert to MUST be the exact type or it will throw an InvalidCastException. Now this may or may not be what you want in this situation, and I leave that up to you. Things like this would fail:       object value = new Employee();     ...     // class cast exception because typeof(IEmployee) != typeof(Employee)     IEmployee emp = value.ConvertTo<IEmployee>();       Yes, that's a downfall of working with Convertible in general, but if you wanted your fluent interface to be more type-safe so that ConvertTo were only callable on IConvertibles (and let casting be a manual task), you could easily make it:         public static T ConvertTo<T>(this IConvertible value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This is what I mean by choosing the best type to extend. Consider that if we used the previous (object) version, every time we typed a dot ('.') on an instance we'd pull up ConvertTo() whether it was applicable or not. By filtering our extension method down to only valid types (those that implement IConvertible) we greatly reduce our IntelliSense pollution and apply a good level of compile-time correctness.   Now my fourth rule is just my general rule-of-thumb. Obviously, you can make extension methods as in-your-face as you want. I included all mine in my work libraries in its own sub-namespace, something akin to:       namespace Shared.Core.Extensions { ... }     This is in a library called Shared.Core, so just referencing the Core library doesn't pollute your IntelliSense, you have to actually do a using on Shared.Core.Extensions to bring the methods in. This is very similar to the way Microsoft puts its extension methods in System.Linq. This way, if you want 'em, you use the appropriate namespace. If you don't want 'em, they won't pollute your namespace.   To really make this work, however, that namespace should only include extension methods and subordinate types those extensions themselves may use. If you plant other useful classes in those namespaces, once a user includes it, they get all the extensions too.   Also, just as a personal preference, extension methods that aren't simply syntactical shortcuts, I like to put in a static utility class and then have extension methods for syntactical candy. For instance, I think it imaginable that any object could be converted to XML:       namespace Shared.Core     {         // A collection of XML Utility classes         public static class XmlUtility         {             ...             // Serialize an object into an xml string             public static string ToXml(object input)             {                 var xs = new XmlSerializer(input.GetType());                   // use new UTF8Encoding here, not Encoding.UTF8. The later includes                 // the BOM which screws up subsequent reads, the former does not.                 using (var memoryStream = new MemoryStream())                 using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))                 {                     xs.Serialize(xmlTextWriter, input);                     return Encoding.UTF8.GetString(memoryStream.ToArray());                 }             }             ...         }     }   I also wanted to be able to call this from an object like:       value.ToXml();     But here's the problem, if i made this an extension method from the start with that one little keyword "this", it would pop into IntelliSense for all objects which could be very polluting. Instead, I put the logic into a utility class so that users have the choice of whether or not they want to use it as just a class and not pollute IntelliSense, then in my extensions namespace, I add the syntactical candy:       namespace Shared.Core.Extensions     {         public static class XmlExtensions         {             public static string ToXml(this object value)             {                 return XmlUtility.ToXml(value);             }         }     }   So now it's the best of both worlds. On one hand, they can use the utility class if they don't want to pollute IntelliSense, and on the other hand they can include the Extensions namespace and use as an extension if they want. The neat thing is it also adheres to the Single Responsibility Principle. The XmlUtility is responsible for converting objects to XML, and the XmlExtensions is responsible for extending object's interface for ToXml().

    Read the article

  • Use BGInfo to Build a Database of System Information of Your Network Computers

    - by Sysadmin Geek
    One of the more popular tools of the Sysinternals suite among system administrators is BGInfo which tacks real-time system information to your desktop wallpaper when you first login. For obvious reasons, having information such as system memory, available hard drive space and system up time (among others) right in front of you is very convenient when you are managing several systems. A little known feature about this handy utility is the ability to have system information automatically saved to a SQL database or some other data file. With a few minutes of setup work you can easily configure BGInfo to record system information of all your network computers in a centralized storage location. You can then use this data to monitor or report on these systems however you see fit. BGInfo Setup If you are familiar with BGInfo, you can skip this section. However, if you have never used this tool, it takes just a few minutes to setup in order to capture the data you are looking for. When you first open BGInfo, a timer will be counting down in the upper right corner. Click the countdown button to keep the interface up so we can edit the settings. Now edit the information you want to capture from the available fields on the right. Since all the output will be redirected to a central location, don’t worry about configuring the layout or formatting. Configuring the Storage Database BGInfo supports the ability to store information in several database formats: SQL Server Database, Access Database, Excel and Text File. To configure this option, open File > Database. Using a Text File The simplest, and perhaps most practical, option is to store the BGInfo data in a comma separated text file. This format allows for the file to be opened in Excel or imported into a database. To use a text file or any other file system type (Excel or MS Access), simply provide the UNC to the respective file. The account running the task to write to this file will need read/write access to both the share and NTFS file permissions. When using a text file, the only option is to have BGInfo create a new entry each time the capture process is run which will add a new line to the respective CSV text file. Using a SQL Database If you prefer to have the data dropped straight into a SQL Server database, BGInfo support this as well. This requires a bit of additional configuration, but overall it is very easy. The first step is to create a database where the information will be stored. Additionally, you will want to create a user account to fill data into this table (and this table only). For your convenience, this script creates a new database and user account (run this as Administrator on your SQL Server machine): @SET Server=%ComputerName%.@SET Database=BGInfo@SET UserName=BGInfo@SET Password=passwordSQLCMD -S “%Server%” -E -Q “Create Database [%Database%]“SQLCMD -S “%Server%” -E -Q “Create Login [%UserName%] With Password=N’%Password%’, DEFAULT_DATABASE=[%Database%], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF”SQLCMD -S “%Server%” -E -d “%Database%” -Q “Create User [%UserName%] For Login [%UserName%]“SQLCMD -S “%Server%” -E -d “%Database%” -Q “EXEC sp_addrolemember N’db_owner’, N’%UserName%’” Note the SQL user account must have ‘db_owner’ permissions on the database in order for BGInfo to work correctly. This is why you should have a SQL user account specifically for this database. Next, configure BGInfo to connect to this database by clicking on the SQL button. Fill out the connection properties according to your database settings. Select the option of whether or not to only have one entry per computer or keep a history of each system. The data will then be dropped directly into a table named “BGInfoTable” in the respective database.   Configure User Desktop Options While the primary function of BGInfo is to alter the user’s desktop by adding system info as part of the wallpaper, for our use here we want to leave the user’s wallpaper alone so this process runs without altering any of the user’s settings. Click the Desktops button. Configure the Wallpaper modifications to not alter anything.   Preparing the Deployment Now we are all set for deploying the configuration to the individual machines so we can start capturing the system data. If you have not done so already, click the Apply button to create the first entry in your data repository. If all is configured correctly, you should be able to open your data file or database and see the entry for the respective machine. Now click the File > Save As menu option and save the configuration as “BGInfoCapture.bgi”.   Deploying to Client Machines Deployment to the respective client machines is pretty straightforward. No installation is required as you just need to copy the BGInfo.exe and the BGInfoCapture.bgi to each machine and place them in the same directory. Once in place, just run the command: BGInfo.exe BGInfoCapture.bgi /Timer:0 /Silent /NoLicPrompt Of course, you probably want to schedule the capture process to run on a schedule. This command creates a Scheduled Task to run the capture process at 8 AM every morning and assumes you copied the required files to the root of your C drive: SCHTASKS /Create /SC DAILY /ST 08:00 /TN “System Info” /TR “C:\BGInfo.exe C:\BGInfoCapture.bgi /Timer:0 /Silent /NoLicPrompt” Adjust as needed, but the end result is the scheduled task command should look something like this:   Download BGInfo from Sysinternals Latest Features How-To Geek ETC How To Create Your Own Custom ASCII Art from Any Image How To Process Camera Raw Without Paying for Adobe Photoshop How Do You Block Annoying Text Message (SMS) Spam? How to Use and Master the Notoriously Difficult Pen Tool in Photoshop HTG Explains: What Are the Differences Between All Those Audio Formats? How To Use Layer Masks and Vector Masks to Remove Complex Backgrounds in Photoshop Bring Summer Back to Your Desktop with the LandscapeTheme for Chrome and Iron The Prospector – Home Dash Extension Creates a Whole New Browsing Experience in Firefox KinEmote Links Kinect to Windows Why Nobody Reads Web Site Privacy Policies [Infographic] Asian Temple in the Snow Wallpaper 10 Weird Gaming Records from the Guinness Book

    Read the article

  • The Data Scientist

    - by BuckWoody
    A new term - well, perhaps not that new - has come up and I’m actually very excited about it. The term is Data Scientist, and since it’s new, it’s fairly undefined. I’ll explain what I think it means, and why I’m excited about it. In general, I’ve found the term deals at its most basic with analyzing data. Of course, we all do that, and the term itself in that definition is redundant. There is no science that I know of that does not work with analyzing lots of data. But the term seems to refer to more than the common practices of looking at data visually, putting it in a spreadsheet or report, or even using simple coding to examine data sets. The term Data Scientist (as far as I can make out this early in it’s use) is someone who has a strong understanding of data sources, relevance (statistical and otherwise) and processing methods as well as front-end displays of large sets of complicated data. Some - but not all - Business Intelligence professionals have these skills. In other cases, senior developers, database architects or others fill these needs, but in my experience, many lack the strong mathematical skills needed to make these choices properly. I’ve divided the knowledge base for someone that would wear this title into three large segments. It remains to be seen if a given Data Scientist would be responsible for knowing all these areas or would specialize. There are pretty high requirements on the math side, specifically in graduate-degree level statistics, but in my experience a company will only have a few of these folks, so they are expected to know quite a bit in each of these areas. Persistence The first area is finding, cleaning and storing the data. In some cases, no cleaning is done prior to storage - it’s just identified and the cleansing is done in a later step. This area is where the professional would be able to tell if a particular data set should be stored in a Relational Database Management System (RDBMS), across a set of key/value pair storage (NoSQL) or in a file system like HDFS (part of the Hadoop landscape) or other methods. Or do you examine the stream of data without storing it in another system at all? This is an important decision - it’s a foundation choice that deals not only with a lot of expense of purchasing systems or even using Cloud Computing (PaaS, SaaS or IaaS) to source it, but also the skillsets and other resources needed to care and feed the system for a long time. The Data Scientist sets something into motion that will probably outlast his or her career at a company or organization. Often these choices are made by senior developers, database administrators or architects in a company. But sometimes each of these has a certain bias towards making a decision one way or another. The Data Scientist would examine these choices in light of the data itself, starting perhaps even before the business requirements are created. The business may not even be aware of all the strategic and tactical data sources that they have access to. Processing Once the decision is made to store the data, the next set of decisions are based around how to process the data. An RDBMS scales well to a certain level, and provides a high degree of ACID compliance as well as offering a well-known set-based language to work with this data. In other cases, scale should be spread among multiple nodes (as in the case of Hadoop landscapes or NoSQL offerings) or even across a Cloud provider like Windows Azure Table Storage. In fact, in many cases - most of the ones I’m dealing with lately - the data should be split among multiple types of processing environments. This is a newer idea. Many data professionals simply pick a methodology (RDBMS with Star Schemas, NoSQL, etc.) and put all data there, regardless of its shape, processing needs and so on. A Data Scientist is familiar not only with the various processing methods, but how they work, so that they can choose the right one for a given need. This is a huge time commitment, hence the need for a dedicated title like this one. Presentation This is where the need for a Data Scientist is most often already being filled, sometimes with more or less success. The latest Business Intelligence systems are quite good at allowing you to create amazing graphics - but it’s the data behind the graphics that are the most important component of truly effective displays. This is where the mathematics requirement of the Data Scientist title is the most unforgiving. In fact, someone without a good foundation in statistics is not a good candidate for creating reports. Even a basic level of statistics can be dangerous. Anyone who works in analyzing data will tell you that there are multiple errors possible when data just seems right - and basic statistics bears out that you’re on the right track - that are only solvable when you understanding why the statistical formula works the way it does. And there are lots of ways of presenting data. Sometimes all you need is a “yes” or “no” answer that can only come after heavy analysis work. In that case, a simple e-mail might be all the reporting you need. In others, complex relationships and multiple components require a deep understanding of the various graphical methods of presenting data. Knowing which kind of chart, color, graphic or shape conveys a particular datum best is essential knowledge for the Data Scientist. Why I’m excited I love this area of study. I like math, stats, and computing technologies, but it goes beyond that. I love what data can do - how it can help an organization. I’ve been fortunate enough in my professional career these past two decades to work with lots of folks who perform this role at companies from aerospace to medical firms, from manufacturing to retail. Interestingly, the size of the company really isn’t germane here. I worked with one very small bio-tech (cryogenics) company that worked deeply with analysis of complex interrelated data. So  watch this space. No, I’m not leaving Azure or distributed computing or Microsoft. In fact, I think I’m perfectly situated to investigate this role further. We have a huge set of tools, from RDBMS to Hadoop to allow me to explore. And I’m happy to share what I learn along the way.

    Read the article

  • C#: Does an IDisposable in a Halted Iterator Dispose?

    - by James Michael Hare
    If that sounds confusing, let me give you an example. Let's say you expose a method to read a database of products, and instead of returning a List<Product> you return an IEnumerable<Product> in iterator form (yield return). This accomplishes several good things: The IDataReader is not passed out of the Data Access Layer which prevents abstraction leak and resource leak potentials. You don't need to construct a full List<Product> in memory (which could be very big) if you just want to forward iterate once. If you only want to consume up to a certain point in the list, you won't incur the database cost of looking up the other items. This could give us an example like: 1: // a sample data access object class to do standard CRUD operations. 2: public class ProductDao 3: { 4: private DbProviderFactory _factory = SqlClientFactory.Instance 5:  6: // a method that would retrieve all available products 7: public IEnumerable<Product> GetAvailableProducts() 8: { 9: // must create the connection 10: using (var con = _factory.CreateConnection()) 11: { 12: con.ConnectionString = _productsConnectionString; 13: con.Open(); 14:  15: // create the command 16: using (var cmd = _factory.CreateCommand()) 17: { 18: cmd.Connection = con; 19: cmd.CommandText = _getAllProductsStoredProc; 20: cmd.CommandType = CommandType.StoredProcedure; 21:  22: // get a reader and pass back all results 23: using (var reader = cmd.ExecuteReader()) 24: { 25: while(reader.Read()) 26: { 27: yield return new Product 28: { 29: Name = reader["product_name"].ToString(), 30: ... 31: }; 32: } 33: } 34: } 35: } 36: } 37: } The database details themselves are irrelevant. I will say, though, that I'm a big fan of using the System.Data.Common classes instead of your provider specific counterparts directly (SqlCommand, OracleCommand, etc). This lets you mock your data sources easily in unit testing and also allows you to swap out your provider in one line of code. In fact, one of the shared components I'm most proud of implementing was our group's DatabaseUtility library that simplifies all the database access above into one line of code in a thread-safe and provider-neutral way. I went with my own flavor instead of the EL due to the fact I didn't want to force internal company consumers to use the EL if they didn't want to, and it made it easy to allow them to mock their database for unit testing by providing a MockCommand, MockConnection, etc that followed the System.Data.Common model. One of these days I'll blog on that if anyone's interested. Regardless, you often have situations like the above where you are consuming and iterating through a resource that must be closed once you are finished iterating. For the reasons stated above, I didn't want to return IDataReader (that would force them to remember to Dispose it), and I didn't want to return List<Product> (that would force them to hold all products in memory) -- but the first time I wrote this, I was worried. What if you never consume the last item and exit the loop? Are the reader, command, and connection all disposed correctly? Of course, I was 99.999999% sure the creators of C# had already thought of this and taken care of it, but inspection in Reflector was difficult due to the nature of the state machines yield return generates, so I decided to try a quick example program to verify whether or not Dispose() will be called when an iterator is broken from outside the iterator itself -- i.e. before the iterator reports there are no more items. So I wrote a quick Sequencer class with a Dispose() method and an iterator for it. Yes, it is COMPLETELY contrived: 1: // A disposable sequence of int -- yes this is completely contrived... 2: internal class Sequencer : IDisposable 3: { 4: private int _i = 0; 5: private readonly object _mutex = new object(); 6:  7: // Constructs an int sequence. 8: public Sequencer(int start) 9: { 10: _i = start; 11: } 12:  13: // Gets the next integer 14: public int GetNext() 15: { 16: lock (_mutex) 17: { 18: return _i++; 19: } 20: } 21:  22: // Dispose the sequence of integers. 23: public void Dispose() 24: { 25: // force output immediately (flush the buffer) 26: Console.WriteLine("Disposed with last sequence number of {0}!", _i); 27: Console.Out.Flush(); 28: } 29: } And then I created a generator (infinite-loop iterator) that did the using block for auto-Disposal: 1: // simply defines an extension method off of an int to start a sequence 2: public static class SequencerExtensions 3: { 4: // generates an infinite sequence starting at the specified number 5: public static IEnumerable<int> GetSequence(this int starter) 6: { 7: // note the using here, will call Dispose() when block terminated. 8: using (var seq = new Sequencer(starter)) 9: { 10: // infinite loop on this generator, means must be bounded by caller! 11: while(true) 12: { 13: yield return seq.GetNext(); 14: } 15: } 16: } 17: } This is really the same conundrum as the database problem originally posed. Here we are using iteration (yield return) over a large collection (infinite sequence of integers). If we cut the sequence short by breaking iteration, will that using block exit and hence, Dispose be called? Well, let's see: 1: // The test program class 2: public class IteratorTest 3: { 4: // The main test method. 5: public static void Main() 6: { 7: Console.WriteLine("Going to consume 10 of infinite items"); 8: Console.Out.Flush(); 9:  10: foreach(var i in 0.GetSequence()) 11: { 12: // could use TakeWhile, but wanted to output right at break... 13: if(i >= 10) 14: { 15: Console.WriteLine("Breaking now!"); 16: Console.Out.Flush(); 17: break; 18: } 19:  20: Console.WriteLine(i); 21: Console.Out.Flush(); 22: } 23:  24: Console.WriteLine("Done with loop."); 25: Console.Out.Flush(); 26: } 27: } So, what do we see? Do we see the "Disposed" message from our dispose, or did the Dispose get skipped because from an "eyeball" perspective we should be locked in that infinite generator loop? Here's the results: 1: Going to consume 10 of infinite items 2: 0 3: 1 4: 2 5: 3 6: 4 7: 5 8: 6 9: 7 10: 8 11: 9 12: Breaking now! 13: Disposed with last sequence number of 11! 14: Done with loop. Yes indeed, when we break the loop, the state machine that C# generates for yield iterate exits the iteration through the using blocks and auto-disposes the IDisposable correctly. I must admit, though, the first time I wrote one, I began to wonder and that led to this test. If you've never seen iterators before (I wrote a previous entry here) the infinite loop may throw you, but you have to keep in mind it is not a linear piece of code, that every time you hit a "yield return" it cedes control back to the state machine generated for the iterator. And this state machine, I'm happy to say, is smart enough to clean up the using blocks correctly. I suspected those wily guys and gals at Microsoft engineered it well, and I wasn't disappointed. But, I've been bitten by assumptions before, so it's good to test and see. Yes, maybe you knew it would or figured it would, but isn't it nice to know? And as those campy 80s G.I. Joe cartoon public service reminders always taught us, "Knowing is half the battle...". Technorati Tags: C#,.NET

    Read the article

  • Service Broker, not ETL

    - by jamiet
    I have been very quiet on this blog of late and one reason for that is I have been very busy on a client project that I would like to talk about a little here. The client that I have been working for has a website that runs on a distributed architecture utilising a messaging infrastructure for communication between different endpoints. My brief was to build a system that could consume these messages and produce analytical information in near-real-time. More specifically I basically had to deliver a data warehouse however it was the real-time aspect of the project that really intrigued me. This real-time requirement meant that using an Extract transformation, Load (ETL) tool was out of the question and so I had no choice but to write T-SQL code (i.e. stored-procedures) to process the incoming messages and load the data into the data warehouse. This concerned me though – I had no way to control the rate at which data would arrive into the system yet we were going to have end-users querying the system at the same time that those messages were arriving; the potential for contention in such a scenario was pretty high and and was something I wanted to minimise as much as possible. Moreover I did not want the processing of data inside the data warehouse to have any impact on the customer-facing website. As you have probably guessed from the title of this blog post this is where Service Broker stepped in! For those that have not heard of it Service Broker is a queuing technology that has been built into SQL Server since SQL Server 2005. It provides a number of features however the one that was of interest to me was the fact that it facilitates asynchronous data processing which, in layman’s terms, means the ability to process some data without requiring the system that supplied the data having to wait for the response. That was a crucial feature because on this project the customer-facing website (in effect an OLTP system) would be calling one of our stored procedures with each message – we did not want to cause the OLTP system to wait on us every time we processed one of those messages. This asynchronous nature also helps to alleviate the contention problem because the asynchronous processing activity is handled just like any other task in the database engine and hence can wait on another task (such as an end-user query). Service Broker it was then! The stored procedure called by the OLTP system would simply put the message onto a queue and we would use a feature called activation to pick each message off the queue in turn and process it into the warehouse. At the time of writing the system is not yet up to full capacity but so far everything seems to be working OK (touch wood) and crucially our users are seeing data in near-real-time. By near-real-time I am talking about latencies of a few minutes at most and to someone like me who is used to building systems that have overnight latencies that is a huge step forward! So then, am I advocating that you all go out and dump your ETL tools? Of course not, no! What this project has taught me though is that in certain scenarios there may be better ways to implement a data warehouse system then the traditional “load data in overnight” approach that we are all used to. Moreover I have really enjoyed getting to grips with a new technology and even if you don’t want to use Service Broker you might want to consider asynchronous messaging architectures for your BI/data warehousing solutions in the future. This has been a very high level overview of my use of Service Broker and I have deliberately left out much of the minutiae of what has been a very challenging implementation. Nonetheless I hope I have caused you to reflect upon your own approaches to BI and question whether other approaches may be more tenable. All comments and questions gratefully received! Lastly, if you have never used Service Broker before and want to kick the tyres I have provided below a very simple “Service Broker Hello World” script that will create all of the objects required to facilitate Service Broker communications and then send the message “Hello World” from one place to anther! This doesn’t represent a “proper” implementation per se because it doesn’t close down down conversation objects (which you should always do in a real-world scenario) but its enough to demonstrate the capabilities! @Jamiet ----------------------------------------------------------------------------------------------- /*This is a basic Service Broker Hello World app. Have fun! -Jamie */ USE MASTER GO CREATE DATABASE SBTest GO --Turn Service Broker on! ALTER DATABASE SBTest SET ENABLE_BROKER GO USE SBTest GO -- 1) we need to create a message type. Note that our message type is -- very simple and allowed any type of content CREATE MESSAGE TYPE HelloMessage VALIDATION = NONE GO -- 2) Once the message type has been created, we need to create a contract -- that specifies who can send what types of messages CREATE CONTRACT HelloContract (HelloMessage SENT BY INITIATOR) GO --We can query the metadata of the objects we just created SELECT * FROM   sys.service_message_types WHERE name = 'HelloMessage'; SELECT * FROM   sys.service_contracts WHERE name = 'HelloContract'; SELECT * FROM   sys.service_contract_message_usages WHERE  service_contract_id IN (SELECT service_contract_id FROM sys.service_contracts WHERE name = 'HelloContract') AND        message_type_id IN (SELECT message_type_id FROM sys.service_message_types WHERE name = 'HelloMessage'); -- 3) The communication is between two endpoints. Thus, we need two queues to -- hold messages CREATE QUEUE SenderQueue CREATE QUEUE ReceiverQueue GO --more querying metatda SELECT * FROM sys.service_queues WHERE name IN ('SenderQueue','ReceiverQueue'); --we can also select from the queues as if they were tables SELECT * FROM SenderQueue   SELECT * FROM ReceiverQueue   -- 4) Create the required services and bind them to be above created queues CREATE SERVICE Sender   ON QUEUE SenderQueue CREATE SERVICE Receiver   ON QUEUE ReceiverQueue (HelloContract) GO --more querying metadata SELECT * FROM sys.services WHERE name IN ('Receiver','Sender'); -- 5) At this point, we can begin the conversation between the two services by -- sending messages DECLARE @conversationHandle UNIQUEIDENTIFIER DECLARE @message NVARCHAR(100) BEGIN   BEGIN TRANSACTION;   BEGIN DIALOG @conversationHandle         FROM SERVICE Sender         TO SERVICE 'Receiver'         ON CONTRACT HelloContract WITH ENCRYPTION=OFF   -- Send a message on the conversation   SET @message = N'Hello, World';   SEND  ON CONVERSATION @conversationHandle         MESSAGE TYPE HelloMessage (@message)   COMMIT TRANSACTION END GO --check contents of queues SELECT * FROM SenderQueue   SELECT * FROM ReceiverQueue   GO -- Receive a message from the queue RECEIVE CONVERT(NVARCHAR(MAX), message_body) AS MESSAGE FROM ReceiverQueue GO --If no messages were received and/or you can't see anything on the queues you may wish to check the following for clues: SELECT * FROM sys.transmission_queue -- Cleanup DROP SERVICE Sender DROP SERVICE Receiver DROP QUEUE SenderQueue DROP QUEUE ReceiverQueue DROP CONTRACT HelloContract DROP MESSAGE TYPE HelloMessage GO USE MASTER GO DROP DATABASE SBTest GO

    Read the article

  • CodePlex Daily Summary for Tuesday, March 23, 2010

    CodePlex Daily Summary for Tuesday, March 23, 2010New Projects.NET StarCraft II Replay Parser: A .NET 3.5 Library used to parse StarCraft II replays. Developed in C# 3.5.BackToBasics "B2B" Chat: With technology and software getting more and more complicated, why not get back to basics with BackToBasicsChat. B2B allows you to chat over a ser...Dark Neuron Game Engine: Dark Neuron allows you to easily create fun and interesting games with no need of developing basic game components. This engine is developed for C#...DeepZoom Pivot Constructor: Library to make building DeepZoom images and Pivot displays simpler.ePaper reader: The project is aimed at creating a tool which helps in reading electronic editions of news papers(pdf/flash)FSharpPageProvider for EPiServer CMS 6: This project starts as the port of EPiServer XmlPageProvider to F# programming language. Hammock for REST: Hammock is a REST library for .NET that greatly simplifies consuming and wrapping RESTful services.Kirill Osenkov: Various small projects, tools, utilities and samples by Kirill OsenkovliveDB: liveDB - web client for sql serverLucilla Framework: lucilla frameworkMVC Foolproof Validation: MVC Foolproof Validation aims to extend the Data Annotation validation provided in ASP.NET MVC. Initial efforts are focused on adding contingent va...MVC2Forums: MVC2Forums is simply a forum system based upon MVC2.Mvvm Foundation Silverlight: Mvvm Foundation Silverlight is a library of classes that are helpful when building Silverlight applications based in the MVVM pattern. This librar...MyPersonalWebsite: This is my personal web site developed using ASP.NET MVC 2Planner: Planner makes it easier for all peoples to plan your tasks. It's developed in Delphi.Prose: Prose is an playground for an experimental JavaScript like language compiler. Eventually it will implement 0-CFA, CFA2, and a Tracing JITQuestTracker: QuestTracker is a todo list presented in the format of a quest tracking list such as the one in World of Warcraft.SevenZipLib: SevenZipLib is a C# interface to the 7-zip library.SimpleGeo .Net: .Net Client library for the SimpleGeo.com serviceNew ReleasesAutenticar no OpenLDAP utilizando pGIna: DLL LDAPAuth Plus: New Group: No LoginBMap.NET: BMap.NET 2: This is the 2nd version of BMap.NET. It has included these tags: Bing Maps, and "About BMap.NET".Cronos: Version 2.04: This is primarily a bug-fix release. Several numerical issues have been resolved, and a resource leak (of MS Windows graphics objects) has been fi...EV Dashboard: v1.0: This release includes support for an App.config file and Auto Connect, which will connect to the specified BMS at startup. Note: You still have to ...GKO Libraries: GKO Libraries 0.1 Beta: 0.1 Beta Added More utilities and functions RefactoringsGLB Virtual Player Builder: 0.4.2 Beta: Beta build that includes a new player creator.HKGolden Express: HKGoldenExpress (Build 201003222215): New features: (None) Bug fix: Fixed bugs of unable to parse XML file stream returned from HKGolden API, as the encoding of XML file stream chang...jQuery Web Controls ASP.Net: jQueryWebControls 1.1.1.2: En esta versión se han corregido problemas existentes en la ejecución de los scripts de jquery cuando se utilizaban MasterPage y/o Ajax Control Too...LightKit: Version 0.2.2: Fixed: fixed bug when CollectionItemsEditor ditermines IsChanged property incorrectly fixed ObjectEditor`a thisstring propertyName method wrong l...LINQ to Twitter: LINQ to Twitter Beta v2.0.8: New items added since v1.1 include: Support for OAuth (via DotNetOpenAuth), secure communication via https, VB language support, serialization of ...MapWindow6: MapWindow 6.0 msi (March 22): This version fixes the icons for the desktop installer and changes the install directory to Program Files\MapWindow instead of Program Files\ISU.Math.NET Numerics: 2010.3.22.1334 Build: Latest alpha buildMiniCalendar Web Part: MiniCalendar Web Part 1.8: A small web part to display links to events stored in a list (or document library) in a mini calendar (in month view mode). It shows tooltips for t...OCInject: Release Two: This release brings some missing features such as Singleton support, Func<T> factories and child containers. It, also, has an updated constructor ...Phalanger - The PHP Language Compiler for the .NET Framework: 2.0 (March 2010): Installer of the latest binaries of Phalanger 2.0 (March 2010) and its integration into Visual Studio 2008. Easy installer with automatic IIS int...Planner: Planner: firstQuestTracker: QuestTracker 0.1: This is the preliminary release of QuestTracker. There's not much documentation or many features yet, but it is functional. Any feedback would be a...QuestTracker: QuestTracker 0.1.1: Bugfix for QuestTracker 0.1QuestTracker: QuestTracker 0.1.2: Fixes an issue with saving the quest list.Rawr: Rawr 2.3.13: We're pleased to announce that, after long last, Rawr3 has entered public beta. You're still welcome to continue using Rawr2 (that's what you're re...Single Web Session: Alpha Model Plugin: !How to use Single Web Session add following line into your web config <httpModules> <add name="SingleSession" type="SingleWebSession.Model.W...SMIL - SharePoint Map Integration Layer: SMIL 1.0: Custom data field Extracts Lat/Lon from EXIF from images being uploaded. Map Web Part Filter with SharePoint views Filter by connecting to...sTASKedit: sTASKedit 42532 (Developer Alpha): This release is only to verify the currently decoded task structure... Supported files: tasks.data (v1.3.6 client)VCC: Latest build, v2.1.30322.0: Automatic drop of latest buildVisual Studio DSite: Advanced Notepad (Visual C++ 2008): An notepad written in c that can save in a rich text file format.Wallpaper Rotator: Wallpaper Rotator 0.5: Wallpaper Rotator 0.5 This version includes the following improvements: Saving the choice of "Random Order (Shuffle Mode)" Updating the configu...Most Popular ProjectsMetaSharpRawrWBFS ManagerSilverlight ToolkitASP.NET Ajax LibraryMicrosoft SQL Server Product Samples: DatabaseAJAX Control ToolkitLiveUpload to FacebookWindows Presentation Foundation (WPF)ASP.NETMost Active ProjectsRawrjQuery Library for SharePoint Web ServicesBlogEngine.NETLINQ to TwitterPHPExcelFarseer Physics EngineFacebook Developer ToolkitNB_Store - Free DotNetNuke Ecommerce Catalog Modulepatterns & practices: Composite WPF and SilverlightN2 CMS

    Read the article

  • Fraud Detection with the SQL Server Suite Part 2

    - by Dejan Sarka
    This is the second part of the fraud detection whitepaper. You can find the first part in my previous blog post about this topic. My Approach to Data Mining Projects It is impossible to evaluate the time and money needed for a complete fraud detection infrastructure in advance. Personally, I do not know the customer’s data in advance. I don’t know whether there is already an existing infrastructure, like a data warehouse, in place, or whether we would need to build one from scratch. Therefore, I always suggest to start with a proof-of-concept (POC) project. A POC takes something between 5 and 10 working days, and involves personnel from the customer’s site – either employees or outsourced consultants. The team should include a subject matter expert (SME) and at least one information technology (IT) expert. The SME must be familiar with both the domain in question as well as the meaning of data at hand, while the IT expert should be familiar with the structure of data, how to access it, and have some programming (preferably Transact-SQL) knowledge. With more than one IT expert the most time consuming work, namely data preparation and overview, can be completed sooner. I assume that the relevant data is already extracted and available at the very beginning of the POC project. If a customer wants to have their people involved in the project directly and requests the transfer of knowledge, the project begins with training. I strongly advise this approach as it offers the establishment of a common background for all people involved, the understanding of how the algorithms work and the understanding of how the results should be interpreted, a way of becoming familiar with the SQL Server suite, and more. Once the data has been extracted, the customer’s SME (i.e. the analyst), and the IT expert assigned to the project will learn how to prepare the data in an efficient manner. Together with me, knowledge and expertise allow us to focus immediately on the most interesting attributes and identify any additional, calculated, ones soon after. By employing our programming knowledge, we can, for example, prepare tens of derived variables, detect outliers, identify the relationships between pairs of input variables, and more, in only two or three days, depending on the quantity and the quality of input data. I favor the customer’s decision of assigning additional personnel to the project. For example, I actually prefer to work with two teams simultaneously. I demonstrate and explain the subject matter by applying techniques directly on the data managed by each team, and then both teams continue to work on the data overview and data preparation under our supervision. I explain to the teams what kind of results we expect, the reasons why they are needed, and how to achieve them. Afterwards we review and explain the results, and continue with new instructions, until we resolve all known problems. Simultaneously with the data preparation the data overview is performed. The logic behind this task is the same – again I show to the teams involved the expected results, how to achieve them and what they mean. This is also done in multiple cycles as is the case with data preparation, because, quite frankly, both tasks are completely interleaved. A specific objective of the data overview is of principal importance – it is represented by a simple star schema and a simple OLAP cube that will first of all simplify data discovery and interpretation of the results, and will also prove useful in the following tasks. The presence of the customer’s SME is the key to resolving possible issues with the actual meaning of the data. We can always replace the IT part of the team with another database developer; however, we cannot conduct this kind of a project without the customer’s SME. After the data preparation and when the data overview is available, we begin the scientific part of the project. I assist the team in developing a variety of models, and in interpreting the results. The results are presented graphically, in an intuitive way. While it is possible to interpret the results on the fly, a much more appropriate alternative is possible if the initial training was also performed, because it allows the customer’s personnel to interpret the results by themselves, with only some guidance from me. The models are evaluated immediately by using several different techniques. One of the techniques includes evaluation over time, where we use an OLAP cube. After evaluating the models, we select the most appropriate model to be deployed for a production test; this allows the team to understand the deployment process. There are many possibilities of deploying data mining models into production; at the POC stage, we select the one that can be completed quickly. Typically, this means that we add the mining model as an additional dimension to an existing DW or OLAP cube, or to the OLAP cube developed during the data overview phase. Finally, we spend some time presenting the results of the POC project to the stakeholders and managers. Even from a POC, the customer will receive lots of benefits, all at the sole risk of spending money and time for a single 5 to 10 day project: The customer learns the basic patterns of frauds and fraud detection The customer learns how to do the entire cycle with their own people, only relying on me for the most complex problems The customer’s analysts learn how to perform much more in-depth analyses than they ever thought possible The customer’s IT experts learn how to perform data extraction and preparation much more efficiently than they did before All of the attendees of this training learn how to use their own creativity to implement further improvements of the process and procedures, even after the solution has been deployed to production The POC output for a smaller company or for a subsidiary of a larger company can actually be considered a finished, production-ready solution It is possible to utilize the results of the POC project at subsidiary level, as a finished POC project for the entire enterprise Typically, the project results in several important “side effects” Improved data quality Improved employee job satisfaction, as they are able to proactively contribute to the central knowledge about fraud patterns in the organization Because eventually more minds get to be involved in the enterprise, the company should expect more and better fraud detection patterns After the POC project is completed as described above, the actual project would not need months of engagement from my side. This is possible due to our preference to transfer the knowledge onto the customer’s employees: typically, the customer will use the results of the POC project for some time, and only engage me again to complete the project, or to ask for additional expertise if the complexity of the problem increases significantly. I usually expect to perform the following tasks: Establish the final infrastructure to measure the efficiency of the deployed models Deploy the models in additional scenarios Through reports By including Data Mining Extensions (DMX) queries in OLTP applications to support real-time early warnings Include data mining models as dimensions in OLAP cubes, if this was not done already during the POC project Create smart ETL applications that divert suspicious data for immediate or later inspection I would also offer to investigate how the outcome could be transferred automatically to the central system; for instance, if the POC project was performed in a subsidiary whereas a central system is available as well Of course, for the actual project, I would repeat the data and model preparation as needed It is virtually impossible to tell in advance how much time the deployment would take, before we decide together with customer what exactly the deployment process should cover. Without considering the deployment part, and with the POC project conducted as suggested above (including the transfer of knowledge), the actual project should still only take additional 5 to 10 days. The approximate timeline for the POC project is, as follows: 1-2 days of training 2-3 days for data preparation and data overview 2 days for creating and evaluating the models 1 day for initial preparation of the continuous learning infrastructure 1 day for presentation of the results and discussion of further actions Quite frequently I receive the following question: are we going to find the best possible model during the POC project, or during the actual project? My answer is always quite simple: I do not know. Maybe, if we would spend just one hour more for data preparation, or create just one more model, we could get better patterns and predictions. However, we simply must stop somewhere, and the best possible way to do this, according to my experience, is to restrict the time spent on the project in advance, after an agreement with the customer. You must also never forget that, because we build the complete learning infrastructure and transfer the knowledge, the customer will be capable of doing further investigations independently and improve the models and predictions over time without the need for a constant engagement with me.

    Read the article

  • NVIDIA x server - "sudo nvidia config" does not generate a working 'xorg.config'

    - by Mike
    I am over 18 hours deep on this challenge. I got to this point and am stuck. very stuck. Maybe you can figure it out? Ubuntu Version 12.04 LTS with all the updates installed. Problem: The default settings in "etc/X11/xorg.conf" that are generated by the "nvidia-xconfig" tool, do not allow the NVIDIA x server to connect to the driver in my "System Settings Additional Driver window". (that's how I understand it. Lots of information below). Symptoms of Problem "System Settings Additional Driver" window has drivers, but the nvidia x server cannot connect/utilize any of the 4 drivers. the drivers are activated, but not in use. When I go to "System Tools Administration NVIDIA x server settings" I get an error that basically tells me to create a default file to initialize the NVIDIA X server (screen shot below). This is the messages the terminal gives after running a "sudo nvidia-xconfig" command for the first time. It seems that the generated file by the tool i just ran is generating a bad/unusable file: If I run the "sudo nvidia-xconfig" command again, I wont get an error the second time. However when I reboot, the default file that is generated (etc/X11/xorg.conf) simply puts the screen resolution at 800 x 600 (or something big like that). When I try to go to NVIDIA x server settings I am greeted with the same screen as the screen shot as in symptom 2 (no option to change the resolution). If I try to go to "system settings display" there are no other resolutions to choose from. At this point I must delete the newly minted "xorg.conf" and reinstate the original in its place. Here are the contents of the "xorg.conf" that is generated first (the one missing required information): # nvidia-xconfig: X configuration file generated by nvidia-xconfig # nvidia-xconfig: version 304.88 (buildmeister@swio-display-x86-rhel47-06) Wed Mar 27 15:32:58 PDT 2013 Section "ServerLayout" Identifier "Layout0" Screen 0 "Screen0" InputDevice "Keyboard0" "CoreKeyboard" InputDevice "Mouse0" "CorePointer" EndSection Section "Files" EndSection Section "InputDevice" # generated from default Identifier "Mouse0" Driver "mouse" Option "Protocol" "auto" Option "Device" "/dev/psaux" Option "Emulate3Buttons" "no" Option "ZAxisMapping" "4 5" EndSection Section "InputDevice" # generated from default Identifier "Keyboard0" Driver "kbd" EndSection Section "Monitor" Identifier "Monitor0" VendorName "Unknown" ModelName "Unknown" HorizSync 28.0 - 33.0 VertRefresh 43.0 - 72.0 Option "DPMS" EndSection Section "Device" Identifier "Device0" Driver "nvidia" VendorName "NVIDIA Corporation" EndSection Section "Screen" Identifier "Screen0" Device "Device0" Monitor "Monitor0" DefaultDepth 24 SubSection "Display" Depth 24 EndSubSection EndSection Hardware: I ran the "lspci|grep VGA". There results are: 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) 01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [Quadro 1000M] (rev a1) More Hardware info: Ram: 16GB CPU: Intel Core i7-2720QM @2.2GHz * 8 Other: 64 bit. This is a triple boot computer and not a VM. Attempts With Not Success on My End: 1) Tried to append the "xorg.conf" with what I perceive is missing information and obviously it didn't fly. 2) All the other stuff I tried got me to this point. 3) See if this link is helpful to you (I barely get it, but i get enough knowing that a smarter person might find this useful): http://manpages.ubuntu.com/manpages/lucid/man1/nvidia-xconfig.1.html 4) I am completely new to Linux (40 hours over past week), but not to programming. However I am very serious about changing over to Linux. When you respond (I hope someone responds...) please respond in a way that a person new to Linux can understand. 5) By the way, the reason I am in this mess is because I MUST have a second monitor running from my laptop, and "System Settings Display" doesn't recognize my second display. I know it is possible to make the second display work in my system, because when I boot from the install CD, I perform work on the native laptop monitor, but the second monitor shows a purple screen with Ubuntu in the middle, so I know the VGA port is sending a signal out. If this is too much for you to tackle please suggest an alternative method to get a second display. I don't want to go to windows but I cannot have a single display. I am really fudged here. I hope some smart person can help. Thanks in advance. Mike. **********************EDIT #1********************** More Details About Graphics Card I was asked "which brand of nvidia-card do you have exactly?" Here is what I did to provide more info (maybe relevant, maybe not, but here is everything): 1) Took my Lenovo W520 right apart to see if there is an identifier on the actual card. However I realized that if I get deep enough to take a look, the laptop "won't like it". so I put it back together. Figuring out the card this way is not an option for me right now. 2) (My computer is triple boot) I logged into Win7 and ran 'dxdiag' command. here is the screen shot: 3) I tried to look on the lenovo website for more details... but no luck. I took a look at my receipts and here is info form receipt: System Unit: W520 NVIDIA Quadro 1000M 2GB 4) In win7 I went to the NVIDIA website and used the option to have my card 'scanned' by a Java applet to determine the latest update for my card. I tried the same with Ubuntu but I can't get the applet to run. Here is the recommended driver from from the NVIDIA Applet for my card for Win7 (I hope this shines some light on the specifics of the card): Quadro/NVS/Tesla/GRID Desktop Driver Release R319 Version: 320.00 WHQL Release Date: 3.5.2013 5) Also I went on the NVIDIA driver search and looked through every possible combination of product type + product series + product to find all the combinations that yield a 1000M card. My card is: Product Type: Quadro Product Series: Quadro Series (Notebooks) Product: 1000M ***********************EDIT #2******************* Additional Symptoms Another question that generated more symptoms I previously didn't mention was: "After generating xorg.conf by nvidia-xconfig, go to additional drivers, do you see nvidia-304?" 1) I took a screen shot of the "additional drivers" right after generating xorg.conf by nvidia-xconfig. Here it is: 2) Then I did a reboot. Now Ubuntu is 600 x 800 resolution. When I logged in after the computer came up I got an error (which I always get after generating xorg.conf by nvidia-xconfig and rebooting) 3) To finally answer the question - No. There is no "NVIDIA-304" driver. Screen shot of additional drivers after generating xorg.conf by nvidia-xconfig and rebooting : At this point I revert to the original xorg.conf and delete the xorg.conf generated by Nvidia.

    Read the article

  • SQL SERVER – Using expressor Composite Types to Enforce Business Rules

    - by pinaldave
    One of the features that distinguish the expressor Data Integration Platform from other products in the data integration space is its concept of composite types, which provide an effective and easily reusable way to clearly define the structure and characteristics of data within your application.  An important feature of the composite type approach is that it allows you to easily adjust the content of a record to its ultimate purpose.  For example, a record used to update a row in a database table is easily defined to include only the minimum set of columns, that is, a value for the key column and values for only those columns that need to be updated. Much like a class in higher level programming languages, you can also use the composite type as a way to enforce business rules onto your data by encapsulating a datum’s name, data type, and constraints (for example, maximum, minimum, or acceptable values) as a single entity, which ensures that your data can not assume an invalid value.  To what extent you use this functionality is a decision you make when designing your application; the expressor design paradigm does not force this approach on you. Let’s take a look at how these features are used.  Suppose you want to create a group of applications that maintain the employee table in your human resources database. Your table might have a structure similar to the HumanResources.Employee table in the AdventureWorks database.  This table includes two columns, EmployeID and rowguid, that are maintained by the relational database management system; you cannot provide values for these columns when inserting new rows into the table. Additionally, there are columns such as VacationHours and SickLeaveHours that you might choose to update for all employees on a monthly basis, which justifies creation of a dedicated application. By creating distinct composite types for the read, insert and update operations against this table, you can more easily manage this table’s content. When developing this application within expressor Studio, your first task is to create a schema artifact for the database table.  This process is completely driven by a wizard, only requiring that you select the desired database schema and table.  The resulting schema artifact defines the mapping of result set records to a record within the expressor data integration application.  The structure of the record within the expressor application is a composite type that is given the default name CompositeType1.  As you can see in the following figure, all columns from the table are included in the result set and mapped to an identically named attribute in the default composite type. If you are developing an application that needs to read this table, perhaps to prepare a year-end report of employees by department, you would probably not be interested in the data in the rowguid and ModifiedDate columns.  A typical approach would be to drop this unwanted data in a downstream operator.  But using an alternative composite type provides a better approach in which the unwanted data never enters your application. While working in expressor  Studio’s schema editor, simply create a second composite type within the same schema artifact, which you could name ReadTable, and remove the attributes corresponding to the unwanted columns. The value of an alternative composite type is even more apparent when you want to insert into or update the table.  In the composite type used to insert rows, remove the attributes corresponding to the EmployeeID primary key and rowguid uniqueidentifier columns since these values are provided by the relational database management system. And to update just the VacationHours and SickLeaveHours columns, use a composite type that includes only the attributes corresponding to the EmployeeID, VacationHours, SickLeaveHours and ModifiedDate columns. By specifying this schema artifact and composite type in a Write Table operator, your upstream application need only deal with the four required attributes and there is no risk of unintentionally overwriting a value in a column that does not need to be updated. Now, what about the option to use the composite type to enforce business rules?  If you review the composition of the default composite type CompositeType1, you will note that the constraints defined for many of the attributes mirror the table column specifications.  For example, the maximum number of characters in the NationaIDNumber, LoginID and Title attributes is equivalent to the maximum width of the target column, and the size of the MaritalStatus and Gender attributes is limited to a single character as required by the table column definition.  If your application code leads to a violation of these constraints, an error will be raised.  The expressor design paradigm then allows you to handle the error in a way suitable for your application.  For example, a string value could be truncated or a numeric value could be rounded. Moreover, you have the option of specifying additional constraints that support business rules unrelated to the table definition. Let’s assume that the only acceptable values for marital status are S, M, and D.  Within the schema editor, double-click on the MaritalStatus attribute to open the Edit Attribute window.  Then click the Allowed Values checkbox and enter the acceptable values into the Constraint Value text box. The schema editor is updated accordingly. There is one more option that the expressor semantic type paradigm supports.  Since the MaritalStatus attribute now clearly specifies how this type of information should be represented (a single character limited to S, M or D), you can convert this attribute definition into a shared type, which will allow you to quickly incorporate this definition into another composite type or into the description of an output record from a transform operator. Again, double-click on the MaritalStatus attribute and in the Edit Attribute window, click Convert, which opens the Share Local Semantic Type window that you use to name this shared type.  There’s no requirement that you give the shared type the same name as the attribute from which it was derived.  You should supply a name that makes it obvious what the shared type represents. In this posting, I’ve overviewed the expressor semantic type paradigm and shown how it can be used to make your application development process more productive.  The beauty of this feature is that you choose when and to what extent you utilize the functionality, but I’m certain that if you opt to follow this approach your efforts will become more efficient and your work will progress more quickly.  As always, I encourage you to download and evaluate expressor Studio for your current and future data integration needs. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, SQL, SQL Authority, SQL Documentation, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • Integrate Google Wave With Your Windows Workflow

    - by Matthew Guay
    Have you given Google Wave a try, only to find it difficult to keep up with?  Here’s how you can integrate Google Wave with your desktop and workflow with some free and simple apps. Google Wave is an online web app, and unlike many Google services, it’s not easily integrated with standard desktop applications.  Instead, you’ll have to keep it open in a browser tab, and since it is one of the most intensive HTML5 webapps available today, you may notice slowdowns in many popular browsers.  Plus, it can be hard to stay on top of your Wave conversations and collaborations by just switching back and forth between the website and whatever else you’re working on.  Here we’ll look at some tools that can help you integrate Google Wave with your workflow, and make it feel more native in Windows. Use Google Wave Directly in Windows What’s one of the best ways to make a web app feel like a native application?  By making it into a native application, of course!  Waver is a free Air powered app that can make the mobile version of Google Wave feel at home on your Windows, Mac, or Linux desktop.  We found it to be a quick and easy way to keep on top of our waves and collaborate with our friends. To get started with Waver, open their homepage on the Adobe Air Marketplace (link below) and click Download From Publisher. Waver is powered by Adobe Air, so if you don’t have Adobe Air installed, you’ll need to first download and install it. After clicking the link above, Adobe Air will open a prompt asking what you wish to do with the file.  Click Open, and then install as normal. Once the installation is finished, enter your Google Account info in the window.   After a few moments, you’ll see your Wave account in miniature, running directly in Waver.  Click a Wave to view it, or click New wave to start a new Wave message.  Unfortunately, in our tests the search box didn’t seem to work, but everything else worked fine. Google Wave works great in Waver, though all of the Wave features are not available since it is running the mobile version of Wave. You can still view content from plugins, including YouTube videos, directly in Waver.   Get Wave Notifications From Your Windows Taskbar Most popular email and Twitter clients give you notifications from your system tray when new messages come in.  And with Google Wave Notifier, you can now get the same alerts when you receive a new Wave message. Head over to the Google Wave Notifier site (link below), and click the download link to get started.  Make sure to download the latest Binary zip, as this one will contain the Windows program rather than the source code. Unzip the folder, and then run GoogleWaveNotifier.exe. On first run, you can enter your Google Account information.  Notice that this is not a standard account login window; you’ll need to enter your email address in the Username field, and then your password below it. You can also change other settings from this dialog, including update frequency and whether or not to run at startup.  Click the value, and then select the setting you want from the dropdown menu. Now, you’ll have a new Wave icon in your system tray.  When it detects new Waves or unread updates, it will display a popup notification with details about the unread Waves.  Additionally, the icon will change to show the number of unread Waves.  Click the popup to open Wave in your browser.  Or, if you have Waver installed, simply open the Waver window to view your latest Waves. If you ever need to change settings again in the future, right-click the icon and select Settings, and then edit as above. Get Wave Notifications in Your Email  Most of us have Outlook or Gmail open all day, and seldom leave the house without a Smartphone with push email.  And thanks to a new Wave feature, you can still keep up with your Waves without having to change your workflow. To activate email notifications from Google Wave, login to your Wave account, click the arrow beside your Inbox, and select Notifications. Select how quickly you want to receive notifications, and choose which email address you wish to receive the notifications.  Click Save when you’re finished. Now you’ll receive an email with information about new and updated Waves in your account.  If there were only small changes, you may get enough info directly in the email; otherwise, you can click the link and open that Wave in your browser. Conclusion Google Wave has great potential as a collaboration and communications platform, but by default it can be hard to keep up with what’s going on in your Waves.  These apps for Windows help you integrate Wave with your workflow, and can keep you from constantly logging in and checking for new Waves.  And since Google Wave registration is now open for everyone, it’s a great time to give it a try and see how it works for yourself. Links Signup for Google Wave (Google Account required) Download Waver from the Adobe Air Marketplace Download Google Wave Notifier Similar Articles Productive Geek Tips We Have 20 Google Wave Invites. Want One?Tired of Waiting for Google Wave? Try ShareFlow NowIntegrate Google Docs with Outlook the Easy WayAwesome Desktop Wallpapers: The Windows 7 EditionWeek in Geek: The Stupid Geek Tricks to Hide Extra Windows Edition TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips HippoRemote Pro 2.2 Xobni Plus for Outlook All My Movies 5.9 CloudBerry Online Backup 1.5 for Windows Home Server Default Programs Editor – One great tool for Setting Defaults Convert BMP, TIFF, PCX to Vector files with RasterVect Free Identify Fonts using WhatFontis.com Windows 7’s WordPad is Actually Good Greate Image Viewing and Management with Zoner Photo Studio Free Windows Media Player Plus! – Cool WMP Enhancer

    Read the article

  • Girl's Day 2012 in Potsdam

    - by jessica.ebbelaar(at)oracle.com
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} Every year in April Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} , technical enterprises and other organisations are invited to organise an open day for girls – called Girl´s Day. It has become a tradition for Oracle for more than 6 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} years, to participate in this special day and to encourage girls to discover technical work environments.   On the 26th of April 2012, 27 pupils aged 12 to 15 came to Oracle’s office in Potsdam in order to obtain interesting insights about Oracle´s business practices. An interactive Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} four-hour program was specifically organized for all participants. At first, all pupils got to know Oracle as an enterprise with it’s different departments and it’s particular „business language“. What is hardware and software? Why do companies need a database? Questions as such were tailored and simply illustrated by 13 colleagues from the areas of Sales, Sales Consulting, Support and Recruitment.   Followed by a short introduction about career paths from our female colleagues and their respective departments, the girls decided, according to their interests, which business area they would like to get more insights from. Based on their decision the groups were set up and the girls than discovered the work places. This helped everyone to dive deep into the everyday work life, how the offices are structured and how communication with clients is done via web conferences. All girls were encouraged to take part in the conference together with their Oracle advisor. 12 o´clock – lunch time. Besides a well-prepared buffet Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} , all girls had now the opportunity to get all open questions clarified or to ask questions they did not dare to ask in front of a big group. After the lunch break, Anja Raack from the Graduate Recruitment team presented more about recruitment topics and gave useful advice on how to write professional emails.   After a short group assignment, where all participants had to identify common mistakes done in an email, a quiz completed this special day. All 5 groups showed a lot of enthusiasm during this game but no one had to worry as every single participant was rewarded with a prize and certificate.   To sum it up, we were very proud to host the girls for half a day and were impressed by their dedication. Hopefully, sooner or later, we will see some of them coming back to Oracle – either for the next Girl´s Day or one of our entry level positions. This day has shown that everyone can start a challenging career within an exciting industry. What matters is dedication and commitment to strive for the best.  Do you want to find out more about our job opportunities? Follow us on http://campus.oracle.com.

    Read the article

  • A Good Developer is So Hard to Find

    - by James Michael Hare
    Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones.  The problem is, it seems like no matter how hard we try to find the perfect way to separate the chaff from the wheat, inevitably someone will get hired who falls far short of expectations or someone will get passed over for missing a piece of trivia or a tricky brain teaser that could have been an excellent team member.   In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc) there often exists a much tighter bond between HR and the hiring development staff because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth.  It seems that if you ask two different people what makes a good developer, often you will get three different opinions.   With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process.  Their view can often be that development is simply a skill that one learns and then once aquired, that developer can produce widgets as good as the next like workers on an assembly-line floor.  On the other side, you have many developers that feel that software development is an art unto itself and that the ability to create the most pure design or know the most obscure of keywords or write the shortest-possible obfuscated piece of code is a good coder.  So is it a skill?  An Art?  Or something entirely in between?   Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point.  This just isn't so.  It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document.  I've interviewed candidates who could answer obscure syntax and keyword questions and once they were hired could not code effectively at all.  So development must be more than a skill.   But on the other end, we have art.  Is development an art?  Is our end result to produce art?  I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code most perform some stated function with accuracy and efficiency and maintainability.  None of these three things have anything to do with art, per se.  Art is beauty for its own sake and is a wonderful thing.  But if you apply that same though to development it just doesn't hold.  I've had developers tell me that all that matters is the end result and how you code it is entirely part of the art and I couldn't disagree more.  Yes, the end result, the accuracy, is the prime criteria to be met.  But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon.  Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution.  So development must be something less than art.   In the end, I think I feel like development is a matter of craftsmanship.  We use our tools and we use our skills and set about to construct something that satisfies a purpose and yet is also elegant and efficient.  There is skill involved, and there is an art, but really it boils down to being able to craft code.  Crafting code is far more than writing code.  Anyone can write code if they know the syntax, but so few people can actually craft code that solves a purpose and craft it well.  So this is what I want to find, I want to find code craftsman!  But how?   I used to ask coding-trivia questions a long time ago and many people still fall back on this.  The thought is that if you ask the candidate some piece of coding trivia and they know the answer it must follow that they can craft good code.  For example:   What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const-instance of that class/struct?  (answer: mutable)   So what do we prove if a candidate can answer this?  Only that they know what mutable means.  One would hope that this would infer that they'd know how to use it, and more importantly when and if it should ever be used!  But it rarely does!  The problem with triva questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kind of tests because they are short and easy to defend if a legal issue arrises on hiring decisions.  After all it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test.  But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds.  There are times I've hired candidates who knew every trivia question I could throw out them and couldn't craft.  And then there are times I've interviewed candidates who failed all my trivia but who I took a chance on who were my best finds ever.    So if not trivia, then what?  Brain teasers?  The thought is, these type of questions measure the thinking power of a candidate.  The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google em) to recognize the "catch" or knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates.  In these cases, I think testing someone with brain teasers only tests their ability to answer brain teasers, not the ability to craft code. So how do we measure someone's ability to craft code?  Here's a novel idea: have them code!  Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil and have them construct a piece of code.  It just makes sense that if we're going to hire someone to code we should actually watch them code.  When they're done, we can judge them on several criteria: Correctness - does the candidate's solution accurately solve the problem proposed? Accuracy - is the candidate's solution reasonably syntactically correct? Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job? Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability? Persona - are they eager and willing or aloof and egotistical?  Will they work well within your team? It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.

    Read the article

  • Using Open MQ as an Oracle CEP Event Source

    - by seth.white
    I helped an Oracle CEP customer recently who wanted to use Open MQ has an event source for their Oracle CEP application.  In this case, the Oracle CEP application was being used to provide monitoring for an electronic commerce website, however, the steps for configuring Open MQ are entirely independent of the application logic. I thought I would list the configuration steps in a blog post in case they might help others in the future. Note that although the Oracle CEP documentation states that only WebLogic and Tibco JMS are "officially" supported, any JMS implementation that provides a Java client should work with Oracle CEP. The first step is to add an adapter to the application's EPN. This can be done in the usual way, using the Eclipse IDE. The end result is something like the following bit of configuration in the application's Spring application context. Note that the provider attribute value of 'jms-inbound' specifies that the out-of-the-box JMS adapter is being used. <wlevs:adapter id="helloworldAdapter" provider="jms-inbound"> </wlevs:adapter>   Next, configure the inbound adapter so that it can connect to Open MQ in the Oracle CEP configuration file (config.xml). The snippet below provides an example of what this configuration should look like. The exact values specified for jndi-provider-url, jndi-factory, connection-jndi-name, destination-jndi-name elements will depend on your Open MQ configuration.  For example , if the name of your Open MQ topic destination is 'ElectronicCommerceTopic', then you would specify that as the destination-jndi-name.  The name of your Open MQ connection factory goes in the connection-jndi-name element. In my simple example, I also specify in event-type element so that the out-of-the-box JMS adapter will attempt to automatically convert incoming messages to events of type HelloWorldEvent. In a more complex application, one would configure a custom converter on the JMS adapter to convert from messages to events.  The Oracle CEP 11.1.3 documentation describes how to do this.   <jms-adapter> <name>helloworldAdapter</name> <event-type>HelloWorldEvent</event-type> <jndi-provider-url>file:///C:/Temp</jndi-provider-url> <jndi-factory>com.sun.jndi.fscontext.RefFSContextFactory</jndi-factory> <connection-jndi-name>YourJMSConnectionFactoryName</connection-jndi-name> <destination-jndi-name>YourJMSDestinationName</destination-jndi-name> </jms-adapter>   Finally, one needs to package the client-side Open MQ jars so that the classes that they contain are available to the Oracle CEP runtime. The recommended way for doing this in the Oracle CEP 11.1.3 release is to package the classes as a library module or simply place them in the application bundle.  The advantage of deploying the classes as a library module is that they are available to any application that wants to connect to Open MQ. In my case, I packaged the classes in my application bundle. A best practice when you want to include additional jars in your application bundle is to create a 'lib' directory in your Eclipse project and then copy the required jars into that directory.  Then, use the support that Eclipse provides to add the jars to the bundle classpath (which makes the classes part of your application in the same way that regular application classes are), and export all of the classes from your application bundle so that they are available to the Oracle CEP server runtime.  The screenshot below Illustrates how this is done in Eclipse.  The bundle classpath contains two Open MQ jars and all packages in the jars are exported.     Finally, import the javax.jms and javax.naming packages into the application module as these are needed by the Open MQ classes. The screenshot below shows the complete list of package imports for my sample application.       Once you have completed these steps, you should be able to build and deploy your application and begin receiving inbound messages from Open MQ. Technorati Tags: CEP,JMS,Adapter,Open MQ,Eclipse .csharpcode { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { background-color: #f4f4f4; margin: 0em; width: 100% } .csharpcode .lnum { color: #606060 } .csharpcode { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { background-color: #f4f4f4; margin: 0em; width: 100% } .csharpcode .lnum { color: #606060 } .csharpcode { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { background-color: #f4f4f4; margin: 0em; width: 100% } .csharpcode .lnum { color: #606060 }

    Read the article

  • How accurate is "Business logic should be in a service, not in a model"?

    - by Jeroen Vannevel
    Situation Earlier this evening I gave an answer to a question on StackOverflow. The question: Editing of an existing object should be done in repository layer or in service? For example if I have a User that has debt. I want to change his debt. Should I do it in UserRepository or in service for example BuyingService by getting an object, editing it and saving it ? My answer: You should leave the responsibility of mutating an object to that same object and use the repository to retrieve this object. Example situation: class User { private int debt; // debt in cents private string name; // getters public void makePayment(int cents){ debt -= cents; } } class UserRepository { public User GetUserByName(string name){ // Get appropriate user from database } } A comment I received: Business logic should really be in a service. Not in a model. What does the internet say? So, this got me searching since I've never really (consciously) used a service layer. I started reading up on the Service Layer pattern and the Unit Of Work pattern but so far I can't say I'm convinced a service layer has to be used. Take for example this article by Martin Fowler on the anti-pattern of an Anemic Domain Model: There are objects, many named after the nouns in the domain space, and these objects are connected with the rich relationships and structure that true domain models have. The catch comes when you look at the behavior, and you realize that there is hardly any behavior on these objects, making them little more than bags of getters and setters. Indeed often these models come with design rules that say that you are not to put any domain logic in the the domain objects. Instead there are a set of service objects which capture all the domain logic. These services live on top of the domain model and use the domain model for data. (...) The logic that should be in a domain object is domain logic - validations, calculations, business rules - whatever you like to call it. To me, this seemed exactly what the situation was about: I advocated the manipulation of an object's data by introducing methods inside that class that do just that. However I realize that this should be a given either way, and it probably has more to do with how these methods are invoked (using a repository). I also had the feeling that in that article (see below), a Service Layer is more considered as a façade that delegates work to the underlying model, than an actual work-intensive layer. Application Layer [his name for Service Layer]: Defines the jobs the software is supposed to do and directs the expressive domain objects to work out problems. The tasks this layer is responsible for are meaningful to the business or necessary for interaction with the application layers of other systems. This layer is kept thin. It does not contain business rules or knowledge, but only coordinates tasks and delegates work to collaborations of domain objects in the next layer down. It does not have state reflecting the business situation, but it can have state that reflects the progress of a task for the user or the program. Which is reinforced here: Service interfaces. Services expose a service interface to which all inbound messages are sent. You can think of a service interface as a façade that exposes the business logic implemented in the application (typically, logic in the business layer) to potential consumers. And here: The service layer should be devoid of any application or business logic and should focus primarily on a few concerns. It should wrap Business Layer calls, translate your Domain in a common language that your clients can understand, and handle the communication medium between server and requesting client. This is a serious contrast to other resources that talk about the Service Layer: The service layer should consist of classes with methods that are units of work with actions that belong in the same transaction. Or the second answer to a question I've already linked: At some point, your application will want some business logic. Also, you might want to validate the input to make sure that there isn't something evil or nonperforming being requested. This logic belongs in your service layer. "Solution"? Following the guidelines in this answer, I came up with the following approach that uses a Service Layer: class UserController : Controller { private UserService _userService; public UserController(UserService userService){ _userService = userService; } public ActionResult MakeHimPay(string username, int amount) { _userService.MakeHimPay(username, amount); return RedirectToAction("ShowUserOverview"); } public ActionResult ShowUserOverview() { return View(); } } class UserService { private IUserRepository _userRepository; public UserService(IUserRepository userRepository) { _userRepository = userRepository; } public void MakeHimPay(username, amount) { _userRepository.GetUserByName(username).makePayment(amount); } } class UserRepository { public User GetUserByName(string name){ // Get appropriate user from database } } class User { private int debt; // debt in cents private string name; // getters public void makePayment(int cents){ debt -= cents; } } Conclusion All together not much has changed here: code from the controller has moved to the service layer (which is a good thing, so there is an upside to this approach). However this doesn't look like it had anything to do with my original answer. I realize design patterns are guidelines, not rules set in stone to be implemented whenever possible. Yet I have not found a definitive explanation of the service layer and how it should be regarded. Is it a means to simply extract logic from the controller and put it inside a service instead? Is it supposed to form a contract between the controller and the domain? Should there be a layer between the domain and the service layer? And, last but not least: following the original comment Business logic should really be in a service. Not in a model. Is this correct? How would I introduce my business logic in a service instead of the model?

    Read the article

  • Clustering for Mere Mortals (Pt 3)

    - by Geoff N. Hiten
    The Controller Now we get to the meat of the matter.  You want a virtual cluster, the first thing you have to do is create your own portable domain.  Start with a plain vanilla install of Windows 2003 R2 Standard on a semi-default VM. (1 GB RAM, 2 cores, 2 NICs, 128GB dynamically expanding VHD file).  I chose this because it had the smallest disk and memory footprint of any current supported Microsoft Server product.  I created the VM with a single dynamically expanding VHD, one fixed 16 GB VHD, and two NICs.  One NIC is connected to the outside world and the other one is part of an internal-only network.  The first NIC is set up as a DHCP client.  We will get to the other one later. I actually tried this with Windows 2008 R2, but it failed miserably.  Not sure whether it was 2008 R2 or the fact I tried to use cloned VMs in the cluster.  Clustering is one place where NewSID would really come in handy.  Too bad Microsoft bought and buried it. Load and Patch the OS (hence the need for the outside connection).This is a good time to go get dinner.  Maybe a movie too.  There are close to a hundred patches that need to be downloaded and applied.  Avoiding that mess was why I put so much time into trying to get the 2008 R2 version working.  Maybe next time.  Don’t forget to add the extensions for VMLite (or whatever virtualization product you prefer). Set a fixed IP address on the internal-only NIC.  Do not give it a gateway.  Put the same IP address for the NIC and for the DNS Server.  This IP should be in a range that is never available on your public network.  You will need all the addresses in the range available.  See the previous post for the exact settings I used. I chose 10.97.230.1 as the server.  The rest of the 10.97.230 range is what I will use later.  For the curious, those numbers are based on elements of my home address.  Not truly random, but good enough for this project. Do not bridge the network connections.  I never allowed the cluster nodes direct access to any public network. Format the fixed VHD and leave it alone for now. Promote the VM to a Domain Controller.  If you have never done this, don’t worry.  The only meaningful decision is what to call the new domain.  I prefer a bogus name that does not correspond to a real Top-Level Domain (TLD).  .com, .biz., .net, .org  are all TLDs that we know and love.  I chose .test as the TLD since it is descriptive AND it does not exist in the real world.  The domain is called MicroAD.  This gives me MicroAD.Test as my domain. During the promotion process, you will be prompted to install DNS as part of the Domain creation process.  You want to accept this option.  The installer will automatically assign this DNS server as the authoritative owner of the MicroAD.test DNS domain (not to be confused with the MicroAD.test Active Directory domain.) For the rest of the DCPROMO process, just accept the defaults. Now let’s make our IP address management easy.  Add the DHCP Role to the server.  Add the server (10.97.230.1 in this case) as the default gateway to assign to DHCP clients.  Here is where you have to be VERY careful and bind it ONLY to the Internal NIC.  Trust me, your network admin will NOT like an extra DHCP server “helping” out on her network.  Go ahead and create a range of 10-20 IP Addresses in your scope.  You might find other uses for a pocket domain controller <cough> Mirroring </cough> than just for building a cluster.  And Clustering in SQL 2008 and Windows 2008 R2 fully supports DHCP addresses. Now we have three of the five key roles ready.  Two more to go. Next comes file sharing.  Since your cluster node VMs will not have access to any outside, you have to have some way to get files into these VMs.  I simply go to the root of C: and create a “Shared” folder.  I then share it out and grant full control to “Everyone” to both the share and to the underlying NTFS folder.   This will be immensely useful for Service Packs, demo databases, and any other software that isn’t packaged as an ISO that we can mount to the VM. Finally we need to create a block-level multi-connect storage device.  The kind folks at Starwinds Software (http://www.starwindsoftware.com/) graciously gave me a non-expiring demo license for expressly this purpose.  Their iSCSI SAN software lets you create an iSCSI target from nearly any storage medium.  Refreshingly, their product does exactly what they say it does.  Thanks. Remember that 16 GB VHD file?  That is where we are going to carve into our LUNs.  I created an iSCSI folder off the root, just so I can keep everything organized.  I then carved 5 ea. 2 GB iSCSI targets from that folder.  I chose a fixed VHD for performance.  I tried this earlier with a dynamically expanding VHD, but too many layers of abstraction and sparseness combined to make it unusable even for a demo.  Stick with a fixed VHD so there is a one-to-one mapping between abstract and physical storage.  If you read the previous post, you know what I named these iSCSI LUNs and why.  Yes, I do have some left over space.  Always leave yourself room for future growth or options. This gets us up to where we can actually build the nodes and install SQL.  As with most clusters, the real work happens long before the individual nodes get installed and configured.  At least it does if you want the cluster to be a true high-availability platform.

    Read the article

  • Cloud Computing = Elasticity * Availability

    - by Herve Roggero
    What is cloud computing? Is hosting the same thing as cloud computing? Are you running a cloud if you already use virtual machines? What is the difference between Infrastructure as a Service (IaaS) and a cloud provider? And the list goes on… these questions keep coming up and all try to fundamentally explain what “cloud” means relative to other concepts. At the risk of over simplification, answering these questions becomes simpler once you understand the primary foundations of cloud computing: Elasticity and Availability.   Elasticity The basic value proposition of cloud computing is to pay as you go, and to pay for what you use. This implies that an application can expand and contract on demand, across all its tiers (presentation layer, services, database, security…).  This also implies that application components can grow independently from each other. So if you need more storage for your database, you should be able to grow that tier without affecting, reconfiguring or changing the other tiers. Basically, cloud applications behave like a sponge; when you add water to a sponge, it grows in size; in the application world, the more customers you add, the more it grows. Pure IaaS providers will provide certain benefits, specifically in terms of operating costs, but an IaaS provider will not help you in making your applications elastic; neither will Virtual Machines. The smallest elasticity unit of an IaaS provider and a Virtual Machine environment is a server (physical or virtual). While adding servers in a datacenter helps in achieving scale, it is hardly enough. The application has yet to use this hardware.  If the process of adding computing resources is not transparent to the application, the application is not elastic.   As you can see from the above description, designing for the cloud is not about more servers; it is about designing an application for elasticity regardless of the underlying server farm.   Availability The fact of the matter is that making applications highly available is hard. It requires highly specialized tools and trained staff. On top of it, it's expensive. Many companies are required to run multiple data centers due to high availability requirements. In some organizations, some data centers are simply on standby, waiting to be used in a case of a failover. Other organizations are able to achieve a certain level of success with active/active data centers, in which all available data centers serve incoming user requests. While achieving high availability for services is relatively simple, establishing a highly available database farm is far more complex. In fact it is so complex that many companies establish yearly tests to validate failover procedures.   To a certain degree certain IaaS provides can assist with complex disaster recovery planning and setting up data centers that can achieve successful failover. However the burden is still on the corporation to manage and maintain such an environment, including regular hardware and software upgrades. Cloud computing on the other hand removes most of the disaster recovery requirements by hiding many of the underlying complexities.   Cloud Providers A cloud provider is an infrastructure provider offering additional tools to achieve application elasticity and availability that are not usually available on-premise. For example Microsoft Azure provides a simple configuration screen that makes it possible to run 1 or 100 web sites by clicking a button or two on a screen (simplifying provisioning), and soon SQL Azure will offer Data Federation to allow database sharding (which allows you to scale the database tier seamlessly and automatically). Other cloud providers offer certain features that are not available on-premise as well, such as the Amazon SC3 (Simple Storage Service) which gives you virtually unlimited storage capabilities for simple data stores, which is somewhat equivalent to the Microsoft Azure Table offering (offering a server-independent data storage model). Unlike IaaS providers, cloud providers give you the necessary tools to adopt elasticity as part of your application architecture.    Some cloud providers offer built-in high availability that get you out of the business of configuring clustered solutions, or running multiple data centers. Some cloud providers will give you more control (which puts some of that burden back on the customers' shoulder) and others will tend to make high availability totally transparent. For example, SQL Azure provides high availability automatically which would be very difficult to achieve (and very costly) on premise.   Keep in mind that each cloud provider has its strengths and weaknesses; some are better at achieving transparent scalability and server independence than others.    Not for Everyone Note however that it is up to you to leverage the elasticity capabilities of a cloud provider, as discussed previously; if you build a website that does not need to scale, for which elasticity is not important, then you can use a traditional host provider unless you also need high availability. Leveraging the technologies of cloud providers can be difficult and can become a journey for companies that build their solutions in a scale up fashion. Cloud computing promises to address cost containment and scalability of applications with built-in high availability. If your application does not need to scale or you do not need high availability, then cloud computing may not be for you. In fact, you may pay a premium to run your applications with cloud providers due to the underlying technologies built specifically for scalability and availability requirements. And as such, the cloud is not for everyone.   Consistent Customer Experience, Predictable Cost With all its complexities, buzz and foggy definition, cloud computing boils down to a simple objective: consistent customer experience at a predictable cost.  The objective of a cloud solution is to provide the same user experience to your last customer than the first, while keeping your operating costs directly proportional to the number of customers you have. Making your applications elastic and highly available across all its tiers, with as much automation as possible, achieves the first objective of a consistent customer experience. And the ability to expand and contract the infrastructure footprint of your application dynamically achieves the cost containment objectives.     Herve Roggero is a SQL Azure MVP and co-author of Pro SQL Azure (APress).  He is the co-founder of Blue Syntax Consulting (www.bluesyntax.net), a company focusing on cloud computing technologies helping customers understand and adopt cloud computing technologies. For more information contact herve at hroggero @ bluesyntax.net .

    Read the article

  • Movement prediction for non-shooters

    - by ShadowChaser
    I'm working on an isometric 2D game with moderate-scale multiplayer, approximately 20-30 players connected at once to a persistent server. I've had some difficulty getting a good movement prediction implementation in place. Physics/Movement The game doesn't have a true physics implementation, but uses the basic principles to implement movement. Rather than continually polling input, state changes (ie/ mouse down/up/move events) are used to change the state of the character entity the player is controlling. The player's direction (ie/ north-east) is combined with a constant speed and turned into a true 3D vector - the entity's velocity. In the main game loop, "Update" is called before "Draw". The update logic triggers a "physics update task" that tracks all entities with a non-zero velocity uses very basic integration to change the entities position. For example: entity.Position += entity.Velocity.Scale(ElapsedTime.Seconds) (where "Seconds" is a floating point value, but the same approach would work for millisecond integer values). The key point is that no interpolation is used for movement - the rudimentary physics engine has no concept of a "previous state" or "current state", only a position and velocity. State Change and Update Packets When the velocity of the character entity the player is controlling changes, a "move avatar" packet is sent to the server containing the entity's action type (stand, walk, run), direction (north-east), and current position. This is different from how 3D first person games work. In a 3D game the velocity (direction) can change frame to frame as the player moves around. Sending every state change would effectively transmit a packet per frame, which would be too expensive. Instead, 3D games seem to ignore state changes and send "state update" packets on a fixed interval - say, every 80-150ms. Since speed and direction updates occur much less frequently in my game, I can get away with sending every state change. Although all of the physics simulations occur at the same speed and are deterministic, latency is still an issue. For that reason, I send out routine position update packets (similar to a 3D game) but much less frequently - right now every 250ms, but I suspect with good prediction I can easily boost it towards 500ms. The biggest problem is that I've now deviated from the norm - all other documentation, guides, and samples online send routine updates and interpolate between the two states. It seems incompatible with my architecture, and I need to come up with a better movement prediction algorithm that is closer to a (very basic) "networked physics" architecture. The server then receives the packet and determines the players speed from it's movement type based on a script (Is the player able to run? Get the player's running speed). Once it has the speed, it combines it with the direction to get a vector - the entity's velocity. Some cheat detection and basic validation occurs, and the entity on the server side is updated with the current velocity, direction, and position. Basic throttling is also performed to prevent players from flooding the server with movement requests. After updating its own entity, the server broadcasts an "avatar position update" packet to all other players within range. The position update packet is used to update the client side physics simulations (world state) of the remote clients and perform prediction and lag compensation. Prediction and Lag Compensation As mentioned above, clients are authoritative for their own position. Except in cases of cheating or anomalies, the client's avatar will never be repositioned by the server. No extrapolation ("move now and correct later") is required for the client's avatar - what the player sees is correct. However, some sort of extrapolation or interpolation is required for all remote entities that are moving. Some sort of prediction and/or lag-compensation is clearly required within the client's local simulation / physics engine. Problems I've been struggling with various algorithms, and have a number of questions and problems: Should I be extrapolating, interpolating, or both? My "gut feeling" is that I should be using pure extrapolation based on velocity. State change is received by the client, client computes a "predicted" velocity that compensates for lag, and the regular physics system does the rest. However, it feels at odds to all other sample code and articles - they all seem to store a number of states and perform interpolation without a physics engine. When a packet arrives, I've tried interpolating the packet's position with the packet's velocity over a fixed time period (say, 200ms). I then take the difference between the interpolated position and the current "error" position to compute a new vector and place that on the entity instead of the velocity that was sent. However, the assumption is that another packet will arrive in that time interval, and it's incredibly difficult to "guess" when the next packet will arrive - especially since they don't all arrive on fixed intervals (ie/ state changes as well). Is the concept fundamentally flawed, or is it correct but needs some fixes / adjustments? What happens when a remote player stops? I can immediately stop the entity, but it will be positioned in the "wrong" spot until it moves again. If I estimate a vector or try to interpolate, I have an issue because I don't store the previous state - the physics engine has no way to say "you need to stop after you reach position X". It simply understands a velocity, nothing more complex. I'm reluctant to add the "packet movement state" information to the entities or physics engine, since it violates basic design principles and bleeds network code across the rest of the game engine. What should happen when entities collide? There are three scenarios - the controlling player collides locally, two entities collide on the server during a position update, or a remote entity update collides on the local client. In all cases I'm uncertain how to handle the collision - aside from cheating, both states are "correct" but at different time periods. In the case of a remote entity it doesn't make sense to draw it walking through a wall, so I perform collision detection on the local client and cause it to "stop". Based on point #2 above, I might compute a "corrected vector" that continually tries to move the entity "through the wall" which will never succeed - the remote avatar is stuck there until the error gets too high and it "snaps" into position. How do games work around this?

    Read the article

  • More Fun With Math

    - by PointsToShare
    More Fun with Math   The runaway student – three different ways of solving one problem Here is a problem I read in a Russian site: A student is running away. He is moving at 1 mph. Pursuing him are a lion, a tiger and his math teacher. The lion is 40 miles behind and moving at 6 mph. The tiger is 28 miles behind and moving at 4 mph. His math teacher is 30 miles behind and moving at 5 mph. Who will catch him first? Analysis Obviously we have a set of three problems. They are all basically the same, but the details are different. The problems are of the same class. Here is a little excursion into computer science. One of the things we strive to do is to create solutions for classes of problems rather than individual problems. In your daily routine, you call it re-usability. Not all classes of problems have such solutions. If a class has a general (re-usable) solution, it is called computable. Otherwise it is unsolvable. Within unsolvable classes, we may still solve individual (some but not all) problems, albeit with different approaches to each. Luckily the vast majority of our daily problems are computable, and the 3 problems of our runaway student belong to a computable class. So, let’s solve for the catch-up time by the math teacher, after all she is the most frightening. She might even make the poor runaway solve this very problem – perish the thought! Method 1 – numerical analysis. At 30 miles and 5 mph, it’ll take her 6 hours to come to where the student was to begin with. But by then the student has advanced by 6 miles. 6 miles require 6/5 hours, but by then the student advanced by another 6/5 of a mile as well. And so on and so forth. So what are we to do? One way is to write code and iterate it until we have solved it. But this is an infinite process so we’ll end up with an infinite loop. So what to do? We’ll use the principles of numerical analysis. Any calculator – your computer included – has a limited number of digits. A double floating point number is good for about 14 digits. Nothing can be computed at a greater accuracy than that. This means that we will not iterate ad infinidum, but rather to the point where 2 consecutive iterations yield the same result. When we do financial computations, we don’t even have to go that far. We stop at the 10th of a penny.  It behooves us here to stop at a 10th of a second (100 milliseconds) and this will how we will avoid an infinite loop. Interestingly this alludes to the Zeno paradoxes of motion – in particular “Achilles and the Tortoise”. Zeno says exactly the same. To catch the tortoise, Achilles must always first come to where the tortoise was, but the tortoise keeps moving – hence Achilles will never catch the tortoise and our math teacher (or lion, or tiger) will never catch the student, or the policeman the thief. Here is my resolution to the paradox. The distance and time in each step are smaller and smaller, so the student will be caught. The only thing that is infinite is the iterative solution. The race is a convergent geometric process so the steps are diminishing, but each step in the solution takes the same amount of effort and time so with an infinite number of steps, we’ll spend an eternity solving it.  This BTW is an original thought that I have never seen before. But I digress. Let’s simply write the code to solve the problem. To make sure that it runs everywhere, I’ll do it in JavaScript. function LongCatchUpTime(D, PV, FV) // D is Distance; PV is Pursuers Velocity; FV is Fugitive’ Velocity {     var t = 0;     var T = 0;     var d = parseFloat(D);     var pv = parseFloat (PV);     var fv = parseFloat (FV);     t = d / pv;     while (t > 0.000001) //a 10th of a second is 1/36,000 of an hour, I used 1/100,000     {         T = T + t;         d = t * fv;         t = d / pv;     }     return T;     } By and large, the higher the Pursuer’s velocity relative to the fugitive, the faster the calculation. Solving this with the 10th of a second limit yields: 7.499999232000001 Method 2 – Geometric Series. Each step in the iteration above is smaller than the next. As you saw, we stopped iterating when the last step was small enough, small enough not to really matter.  When we have a sequence of numbers in which the ratio of each number to its predecessor is fixed we call the sequence geometric. When we are looking at the sum of sequence, we call the sequence of sums series.  Now let’s look at our student and teacher. The teacher runs 5 times faster than the student, so with each iteration the distance between them shrinks to a fifth of what it was before. This is a fixed ratio so we deal with a geometric series.  We normally designate this ratio as q and when q is less than 1 (0 < q < 1) the sum of  + … +  is  – 1) / (q – 1). When q is less than 1, it is easier to use ) / (1 - q). Now, the steps are 6 hours then 6/5 hours then 6/5*5 and so on, so q = 1/5. And the whole series is multiplied by 6. Also because q is less than 1 , 1/  diminishes to 0. So the sum is just  / (1 - q). or 1/ (1 – 1/5) = 1 / (4/5) = 5/4. This times 6 yields 7.5 hours. We can now continue with some algebra and take it back to a simpler formula. This is arduous and I am not going to do it here. Instead let’s do some simpler algebra. Method 3 – Simple Algebra. If the time to capture the fugitive is T and the fugitive travels at 1 mph, then by the time the pursuer catches him he travelled additional T miles. Time is distance divided by speed, so…. (D + T)/V = T  thus D + T = VT  and D = VT – T = (V – 1)T  and T = D/(V – 1) This “strangely” coincides with the solution we just got from the geometric sequence. This is simpler ad faster. Here is the corresponding code. function ShortCatchUpTime(D, PV, FV) {     var d = parseFloat(D);     var pv = parseFloat (PV);     var fv = parseFloat (FV);     return d / (pv - fv); } The code above, for both the iterative solution and the algebraic solution are actually for a larger class of problems.  In our original problem the student’s velocity (speed) is 1 mph. In the code it may be anything as long as it is less than the pursuer’s velocity. As long as PV > FV, the pursuer will catch up. Here is the really general formula: T = D / (PV – FV) Finally, let’s run the program for each of the pursuers.  It could not be worse. I know he’d rather be eaten alive than suffering through yet another math lesson. See the code run? Select  “Catch Up Time” in www.mgsltns.com/games.htm The host is running on Unix, so the link is case sensitive. That’s All Folks

    Read the article

  • More Great Improvements to the Windows Azure Management Portal

    - by ScottGu
    Over the last 3 weeks we’ve released a number of enhancements to the new Windows Azure Management Portal.  These new capabilities include: Localization Support for 6 languages Operation Log Support Support for SQL Database Metrics Virtual Machine Enhancements (quick create Windows + Linux VMs) Web Site Enhancements (support for creating sites in all regions, private github repo deployment) Cloud Service Improvements (deploy from storage account, configuration support of dedicated cache) Media Service Enhancements (upload, encode, publish, stream all from within the portal) Virtual Networking Usability Enhancements Custom CNAME support with Storage Accounts All of these improvements are now live in production and available to start using immediately.  Below are more details on them: Localization Support The Windows Azure Portal now supports 6 languages – English, German, Spanish, French, Italian and Japanese. You can easily switch between languages by clicking on the Avatar bar on the top right corner of the Portal: Selecting a different language will automatically refresh the UI within the portal in the selected language: Operation Log Support The Windows Azure Portal now supports the ability for administrators to review the “operation logs” of the services they manage – making it easy to see exactly what management operations were performed on them.  You can query for these by selecting the “Settings” tab within the Portal and then choosing the “Operation Logs” tab within it.  This displays a filter UI that enables you to query for operations by date and time: As of the most recent release we now show logs for all operations performed on Cloud Services and Storage Accounts.  You can click on any operation in the list and click the “Details” button in the command bar to retrieve detailed status about it.  This now makes it possible to retrieve details about every management operation performed. In future updates you’ll see us extend the operation log capability to apply to all Windows Azure Services – which will enable great post-mortem and audit support. Support for SQL Database Metrics You can now monitor the number of successful connections, failed connections and deadlocks in your SQL databases using the new “Dashboard” view provided on each SQL Database resource: Additionally, if the database is added as a “linked resource” to a Web Site or Cloud Service, monitoring metrics for the linked SQL database are shown along with the Web Site or Cloud Service metrics in the dashboard. This helps with viewing and managing aggregated information across both resources in your application. Enhancements to Virtual Machines The most recent Windows Azure Portal release brings with it some nice usability improvements to Virtual Machines: Integrated Quick Create experience for Windows and Linux VMs Creating a new Windows or Linux VM is now easy using the new “Quick Create” experience in the Portal: In addition to Windows VM templates you can also now select Linux image templates in the quick create UI: This makes it incredibly easy to create a new Virtual Machine in only a few seconds. Enhancements to Web Sites Prior to this past month’s release, users were forced to choose a single geographical region when creating their first site.  After that, subsequent sites could only be created in that same region.  This restriction has now been removed, and you can now create sites in any region at any time and have up to 10 free sites in each supported region: One of the new regions we’ve recently opened up is the “East Asia” region.  This allows you to now deploy sites to North America, Europe and Asia simultaneously.  Private GitHub Repository Support This past week we also enabled Git based continuous deployment support for Web Sites from private GitHub and BitBucket repositories (previous to this you could only enable this with public repositories).  Enhancements to Cloud Services Experience The most recent Windows Azure Portal release brings with it some nice usability improvements to Cloud Services: Deploy a Cloud Service from a Windows Azure Storage Account The Windows Azure Portal now supports deploying an application package and configuration file stored in a blob container in Windows Azure Storage. The ability to upload an application package from storage is available when you custom create, or upload to, or update a cloud service deployment. To upload an application package and configuration, create a Cloud Service, then select the file upload dialog, and choose to upload from a Windows Azure Storage Account: To upload an application package from storage, click the “FROM STORAGE” button and select the application package and configuration file to use from the new blob storage explorer in the portal. Configure Windows Azure Caching in a caching enabled cloud service If you have deployed the new dedicated cache within a cloud service role, you can also now configure the cache settings in the portal by navigating to the configuration tab of for your Cloud Service deployment. The configuration experience is similar to the one in Visual Studio when you create a cloud service and add a caching role.  The portal now allows you to add or remove named caches and change the settings for the named caches – all from within the Portal and without needing to redeploy your application. Enhancements to Media Services You can now upload, encode, publish, and play your video content directly from within the Windows Azure Portal.  This makes it incredibly easy to get started with Windows Azure Media Services and perform common tasks without having to write any code. Simply navigate to your media service and then click on the “Content” tab.  All of the media content within your media service account will be listed here: Clicking the “upload” button within the portal now allows you to upload a media file directly from your computer: This will cause the video file you chose from your local file-system to be uploaded into Windows Azure.  Once uploaded, you can select the file within the content tab of the Portal and click the “Encode” button to transcode it into different streaming formats: The portal includes a number of pre-set encoding formats that you can easily convert media content into: Once you select an encoding and click the ok button, Windows Azure Media Services will kick off an encoding job that will happen in the cloud (no need for you to stand-up or configure a custom encoding server).  When it’s finished, you can select the video in the “Content” tab and then click PUBLISH in the command bar to setup an origin streaming end-point to it: Once the media file is published you can point apps against the public URL and play the content using Windows Azure Media Services – no need to setup or run your own streaming server.  You can also now select the file and click the “Play” button in the command bar to play it using the streaming endpoint directly within the Portal: This makes it incredibly easy to try out and use Windows Azure Media Services and test out an end-to-end workflow without having to write any code.  Once you test things out you can of course automate it using script or code – providing you with an incredibly powerful Cloud Media platform that you can use. Enhancements to Virtual Network Experience Over the last few months, we have received feedback on the complexity of the Virtual Network creation experience. With these most recent Portal updates, we have added a Quick Create experience that makes the creation experience very simple. All that an administrator now needs to do is to provide a VNET name, choose an address space and the size of the VNET address space. They no longer need to understand the intricacies of the CIDR format or walk through a 4-page wizard or create a VNET / subnet. This makes creating virtual networks really simple: The portal also now has a “Register DNS Server” task that makes it easy to register DNS servers and associate them with a virtual network. Enhancements to Storage Experience The portal now lets you register custom domain names for your Windows Azure Storage Accounts.  To enable this, select a storage resource and then go to the CONFIGURE tab for a storage account, and then click MANAGE DOMAIN on the command bar: Clicking “Manage Domain” will bring up a dialog that allows you to register any CNAME you want: Summary The above features are all now live in production and available to use immediately.  If you don’t already have a Windows Azure account, you can sign-up for a free trial and start using them today.  Visit the Windows Azure Developer Center to learn more about how to build apps with it. One of the other cool features that is now live within the portal is our new Windows Azure Store – which makes it incredibly easy to try and purchase developer services from a variety of partners.  It is an incredibly awesome new capability – and something I’ll be doing a dedicated post about shortly. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • The Internet of Things Is Really the Internet of People

    - by HCM-Oracle
    By Mark Hurd - Originally Posted on LinkedIn As I speak with CEOs around the world, our conversations invariably come down to this central question: Can we change our corporate cultures and the ways we train and reward our people as rapidly as new technology is changing the work we do, the products we make and how we engage with customers? It’s a critical consideration given today’s pace of disruption, which already is straining traditional management models and HR strategies. Winning companies will bring innovation and vision to their employees and partners by attracting people who will thrive in this emerging world of relentless data, predictive analytics and unlimited what-if scenarios. So, where are we going to find employees who are as familiar with complex data as I am with orderly financial statements and business plans? I’m not just talking about high-end data scientists who most certainly will sit at or near the top of the new decision-making pyramid. Global organizations will need creative and motivated people who will devote their time to manipulating, reviewing, analyzing, sorting and reshaping data to drive business and delight customers. This might seem evident, but my conversations with business people across the globe indicate that only a small number of companies get it. In the past few years, executives have been busy keeping pace with seismic upheavals, including the rise of social customer engagement, the rapid acceleration of product-development cycles and the relentless move to mobile-first. But all of that, I think, is the start of an uphill climb to the top of a roller-coaster. Today, about 10 billion devices across the globe are connected to the Internet. In a couple of years, that number will probably double, and not because we will have bought 10 billion more computers, smart phones and tablets. This unprecedented explosion of Big Data is being triggered by the Internet of Things, which is another way of saying that the numerous intelligent devices touching our everyday lives are all becoming interconnected. Home appliances, food, industrial equipment, pets, pharmaceutical products, pallets, cars, luggage, packaged goods, athletic equipment, even clothing will be streaming data. Some data will provide important information about how to run our businesses and lead healthier lives. Much of it will be extraneous. How does a CEO cope with this unimaginable volume and velocity of data, much less harness it to excite and delight customers? Here are three things CEOs must do to tackle this challenge: 1) Take care of your employees, take care of your customers. Larry Ellison recently noted that the two most important priorities for any CEO today revolve around people: Taking care of your employees and taking care of your customers. Companies in today’s hypercompetitive business environment simply won’t be able to survive unless they’ve got world-class people at all levels of the organization. CEOs must demonstrate a commitment to employees by becoming champions for HR systems that empower every employee to fully understand his or her job, how it ties into the corporate framework, what’s expected of them, what training is available, and how they can use an embedded social network to communicate, collaborate and excel. Over the next several years, many of the world’s top industrialized economies will see a turnover in the workforce on an unprecedented scale. Across the United States, Europe, China and Japan, the “baby boomer” generation will be retiring and, by 2020, we’ll see turnovers in those regions ranging from 10 to 30 percent. How will companies replace all that brainpower, experience and know-how? How will CEOs perpetuate the best elements of their corporate cultures in the midst of this profound turnover? The challenge will be daunting, but it can be met with world-class HR technology. As companies begin replacing up to 30 percent of their workforce, they will need thousands of new types of data-native workers to exploit the Internet of Things in the service of the Internet of People. The shift in corporate mindset here can’t be overstated. The CEO has to be at the forefront of this new way of recruiting, training, motivating, aligning and developing truly 21-century talent. 2) Start thinking today about the Internet of People. Some forward-looking companies have begun pursuing the “democratization of data.” This allows more people within a company greater access to data that can help them make better decisions, move more quickly and keep pace with the changing interests and demands of their customers. As a result, we’ve seen organizations flatten out, growing numbers of well-informed people authorized to make decisions without corporate approval and a movement of engagement away from headquarters to the point of contact with the customer. These are profound changes, and I’m a huge proponent. As I think about what the next few years will bring as companies become deluged with unprecedented streams of data, I’m convinced that we’ll need dramatically different organizational structures, decision-making models, risk-management profiles and reward systems. For example, if a car company’s marketing department mines incoming data to determine that customers are shifting rapidly toward neon-green models, how many layers of approval, review, analysis and sign-off will be needed before the factory starts cranking out more neon-green cars? Will we continue to have organizations where too many people are empowered to say “No” and too few are allowed to say “Yes”? If so, how will those companies be able to compete in a world in which customers have more choices, instant access to more information and less loyalty than ever before? That’s why I think CEOs need to begin thinking about this problem right now, not in a year or two when competitors are already reshaping their organizations to match the marketplace’s new realities. 3) Partner with universities to help create a new type of highly skilled workers. Several years ago, universities introduced new undergraduate as well as graduate-level programs in analytics and informatics as the business need for deeper insights into the booming world of data began to explode. Today, as the growth rate of data continues to soar, we know that the Internet of Things will only intensify that growth. Moreover, as Big Data fuels insights that can be shaped into products and services that generate revenue, the demand for data scientists and data specialists will go on unabated. Beyond that top-level expertise, companies are going to need data-native thinkers at all levels of the organization. Where will this new type of worker come from? I think it’s incumbent on the business community to collaborate with universities to develop new curricula designed to turn out graduates who can capitalize on the data-driven world that the Internet of Things is surely going to create. These new workers will create opportunities to help their companies in fields as diverse as product design, customer service, marketing, manufacturing and distribution. They will become innovative leaders in fashioning an entirely new type of workforce and organizational structure optimized to fully exploit the Internet of Things so that it becomes a high-value enabler of the Internet of People. Mark Hurd is President of Oracle Corporation and a member of the company's Board of Directors. He joined Oracle in 2010, bringing more than 30 years of technology industry leadership, computer hardware expertise, and executive management experience to his role with the company. As President, Mr. Hurd oversees the corporate direction and strategy for Oracle's global field operations, including marketing, sales, consulting, alliances and channels, and support. He focuses on strategy, leadership, innovation, and customers.

    Read the article

  • T4 Performance Counters explained

    - by user13346607
    Now that T4 is out for a few month some people might have wondered what details of the new pipeline you can monitor. A "cpustat -h" lists a lot of events that can be monitored, and only very few are self-explanatory. I will try to give some insight on all of them, some of these "PIC events" require an in-depth knowledge of T4 pipeline. Over time I will try to explain these, for the time being these events should simply be ignored. (Side note: some counters changed from tape-out 1.1 (*only* used in the T4 beta program) to tape-out 1.2 (used in the systems shipping today) The table only lists the tape-out 1.2 counters) 0 0 1 1058 6033 Oracle Microelectronics 50 14 7077 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} pic name (cpustat) Prose Comment Sel-pipe-drain-cycles, Sel-0-[wait|ready], Sel-[1,2] Sel-0-wait counts cycles a strand waits to be selected. Some reasons can be counted in detail; these are: Sel-0-ready: Cycles a strand was ready but not selected, that can signal pipeline oversubscription Sel-1: Cycles only one instruction or µop was selected Sel-2: Cycles two instructions or µops were selected Sel-pipe-drain-cycles: cf. PRM footnote 8 to table 10.2 Pick-any, Pick-[0|1|2|3] Cycles one, two, three, no or at least one instruction or µop is picked Instr_FGU_crypto Number of FGU or crypto instructions executed on that vcpu Instr_ld dto. for load Instr_st dto. for store SPR_ring_ops dto. for SPR ring ops Instr_other dto. for all other instructions not listed above, PRM footnote 7 to table 10.2 lists the instructions Instr_all total number of instructions executed on that vcpu Sw_count_intr Nr of S/W count instructions on that vcpu (sethi %hi(fc000),%g0 (whatever that is))  Atomics nr of atomic ops, which are LDSTUB/a, CASA/XA, and SWAP/A SW_prefetch Nr of PREFETCH or PREFETCHA instructions Block_ld_st Block loads or store on that vcpu IC_miss_nospec, IC_miss_[L2_or_L3|local|remote]\ _hit_nospec Various I$ misses, distinguished by where they hit. All of these count per thread, but only primary events: T4 counts only the first occurence of an I$ miss on a core for a certain instruction. If one strand misses in I$ this miss is counted, but if a second strand on the same core misses while the first miss is being resolved, that second miss is not counted This flavour of I$ misses counts only misses that are caused by instruction that really commit (note the "_nospec") BTC_miss Branch target cache miss ITLB_miss ITLB misses (synchronously counted) ITLB_miss_asynch dto. but asynchronously [I|D]TLB_fill_\ [8KB|64KB|4MB|256MB|2GB|trap] H/W tablewalk events that fill ITLB or DTLB with translation for the corresponding page size. The “_trap” event occurs if the HWTW was not able to fill the corresponding TLB IC_mtag_miss, IC_mtag_miss_\ [ptag_hit|ptag_miss|\ ptag_hit_way_mismatch] I$ micro tag misses, with some options for drill down Fetch-0, Fetch-0-all fetch-0 counts nr of cycles nothing was fetched for this particular strand, fetch-0-all counts cycles nothing was fetched for all strands on a core Instr_buffer_full Cycles the instruction buffer for a strand was full, thereby preventing any fetch BTC_targ_incorrect Counts all occurences of wrongly predicted branch targets from the BTC [PQ|ROB|LB|ROB_LB|SB|\ ROB_SB|LB_SB|RB_LB_SB|\ DTLB_miss]\ _tag_wait ST_q_tag_wait is listed under sl=20. These counters monitor pipeline behaviour therefore they are not strand specific: PQ_...: cycles Rename stage waits for a Pick Queue tag (might signal memory bound workload for single thread mode, cf. Mail from Richard Smith) ROB_...: cycles Select stage waits for a ROB (ReOrderBuffer) tag LB_...: cycles Select stage waits for a Load Buffer tag SB_...: cycles Select stage waits for Store Buffer tag combinations of the above are allowed, although some of these events can overlap, the counter will only be incremented once per cycle if any of these occur DTLB_...: cycles load or store instructions wait at Pick stage for a DTLB miss tag [ID]TLB_HWTW_\ [L2_hit|L3_hit|L3_miss|all] Counters for HWTW accesses caused by either DTLB or ITLB misses. Canbe further detailed by where they hit IC_miss_L2_L3_hit, IC_miss_local_remote_remL3_hit, IC_miss I$ prefetches that were dropped because they either miss in L2$ or L3$ This variant counts misses regardless if the causing instruction commits or not DC_miss_nospec, DC_miss_[L2_L3|local|remote_L3]\ _hit_nospec D$ misses either in general or detailed by where they hit cf. the explanation for the IC_miss in two flavours for an explanation of _nospec and the reasoning for two DC_miss counters DTLB_miss_asynch counts all DTLB misses asynchronously, there is no way to count them synchronously DC_pref_drop_DC_hit, SW_pref_drop_[DC_hit|buffer_full] L1-D$ h/w prefetches that were dropped because of a D$ hit, counted per core. The others count software prefetches per strand [Full|Partial]_RAW_hit_st_[buf|q] Count events where a load wants to get data that has not yet been stored, i. e. it is still inside the pipeline. The data might be either still in the store buffer or in the store queue. If the load's data matches in the SB and in the store queue the data in buffer takes precedence of course since it is younger [IC|DC]_evict_invalid, [IC|DC|L1]_snoop_invalid, [IC|DC|L1]_invalid_all Counter for invalidated cache evictions per core St_q_tag_wait Number of cycles pipeline waits for a store queue tag, of course counted per core Data_pref_[drop_L2|drop_L3|\ hit_L2|hit_L3|\ hit_local|hit_remote] Data prefetches that can be further detailed by either why they were dropped or where they did hit St_hit_[L2|L3], St_L2_[local|remote]_C2C, St_local, St_remote Store events distinguished by where they hit or where they cause a L2 cache-to-cache transfer, i.e. either a transfer from another L2$ on the same die or from a different die DC_miss, DC_miss_\ [L2_L3|local|remote]_hit D$ misses either in general or detailed by where they hit cf. the explanation for the IC_miss in two flavours for an explanation of _nospec and the reasoning for two DC_miss counters L2_[clean|dirty]_evict Per core clean or dirty L2$ evictions L2_fill_buf_full, L2_wb_buf_full, L2_miss_buf_full Per core L2$ buffer events, all count number of cycles that this state was present L2_pipe_stall Per core cycles pipeline stalled because of L2$ Branches Count branches (Tcc, DONE, RETRY, and SIT are not counted as branches) Br_taken Counts taken branches (Tcc, DONE, RETRY, and SIT are not counted as branches) Br_mispred, Br_dir_mispred, Br_trg_mispred, Br_trg_mispred_\ [far_tbl|indir_tbl|ret_stk] Counter for various branch misprediction events.  Cycles_user counts cycles, attribute setting hpriv, nouser, sys controls addess space to count in Commit-[0|1|2], Commit-0-all, Commit-1-or-2 Number of times either no, one, or two µops commit for a strand. Commit-0-all counts number of times no µop commits for the whole core, cf. footnote 11 to table 10.2 in PRM for a more detailed explanation on how this counters interacts with the privilege levels

    Read the article

  • Clustering for Mere Mortals (Pt3)

    - by Geoff N. Hiten
    The Controller Now we get to the meat of the matter.  You want a virtual cluster, the first thing you have to do is create your own portable domain.  IStart with a plain vanilla install of Windows 2003 R2 Standard on a semi-default VM. (1 GB RAM, 2 cores, 2 NICs, 128GB dynamically expanding VHD file).  I chose this because it had the smallest disk and memory footprint of any current supported Microsoft Server product.  I created the VM with a single dynamically expanding VHD, one fixed 16 GB VHD, and two NICs.  One NIC is connected to the outside world and the other one is part of an internal-only network.  The first NIC is set up as a DHCP client.  We will get to the other one later. I actually tried this with Windows 2008 R2, but it failed miserably.  Not sure whether it was 2008 R2 or the fact I tried to use cloned VMs in the cluster.  Clustering is one place where NewSID would really come in handy.  Too bad Microsoft bought and buried it. Load and Patch the OS (hence the need for the outside connection).This is a good time to go get dinner.  Maybe a movie too.  There are close to a hundred patches that need to be downloaded and applied.  Avoiding that mess was why I put so much time into trying to get the 2008 R2 version working.  Maybe next time.  Don’t forget to add the extensions for VMLite (or whatever virtualization product you prefer). Set a fixed IP address on the internal-only NIC.  Do not give it a gateway.  Put the same IP address for the NIC and for the DNS Server.  This IP should be in a range that is never available on your public network.  You will need all the addresses in the range available.  See the previous post for the exact settings I used. I chose 10.97.230.1 as the server.  The rest of the 10.97.230 range is what I will use later.  For the curious, those numbers are based on elements of my home address.  Not truly random, but good enough for this project. Do not bridge the network connections.  I never allowed the cluster nodes direct access to any public network. Format the fixed VHD and leave it alone for now. Promote the VM to a Domain Controller.  If you have never done this, don’t worry.  The only meaningful decision is what to call the new domain.  I prefer a bogus name that does not correspond to a real Top-Level Domain (TLD).  .com, .biz., .net, .org  are all TLDs that we know and love.  I chose .test as the TLD since it is descriptive AND it does not exist in the real world.  The domain is called MicroAD.  This gives me MicroAD.Test as my domain. During the promotion process, you will be prompted to install DNS as part of the Domain creation process.  You want to accept this option.  The installer will automatically assign this DNS server as the authoritative owner of the MicroAD.test DNS domain (not to be confused with the MicroAD.test Active Directory domain.) For the rest of the DCPROMO process, just accept the defaults. Now let’s make our IP address management easy.  Add the DHCP Role to the server.  Add the server (10.97.230.1 in this case) as the default gateway to assign to DHCP clients.  Here is where you have to be VERY careful and bind it ONLY to the Internal NIC.  Trust me, your network admin will NOT like an extra DHCP server “helping” out on her network.  Go ahead and create a range of 10-20 IP Addresses in your scope.  You might find other uses for a pocket domain controller <cough> Mirroring </cough> than just for building a cluster.  And Clustering in SQL 2008 and Windows 2008 R2 fully supports DHCP addresses. Now we have three of the five key roles ready.  Two more to go. Next comes file sharing.  Since your cluster node VMs will not have access to any outside, you have to have some way to get files into these VMs.  I simply go to the root of C: and create a “Shared” folder.  I then share it out and grant full control to “Everyone” to both the share and to the underlying NTFS folder.   This will be immensely useful for Service Packs, demo databases, and any other software that isn’t packaged as an ISO that we can mount to the VM. Finally we need to create a block-level multi-connect storage device.  The kind folks at Starwinds Software (http://www.starwindsoftware.com/) graciously gave me a non-expiring demo license for expressly this purpose.  Their iSCSI SAN software lets you create an iSCSI target from nearly any storage medium.  Refreshingly, their product does exactly what they say it does.  Thanks. Remember that 16 GB VHD file?  That is where we are going to carve into our LUNs.  I created an iSCSI folder off the root, just so I can keep everything organized.  I then carved 5 ea. 2 GB iSCSI targets from that folder.  I chose a fixed VHD for performance.  I tried this earlier with a dynamically expanding VHD, but too many layers of abstraction and sparseness combined to make it unusable even for a demo.  Stick with a fixed VHD so there is a one-to-one mapping between abstract and physical storage.  If you read the previous post, you know what I named these iSCSI LUNs and why.  Yes, I do have some left over space.  Always leave yourself room for future growth or options. This gets us up to where we can actually build the nodes and install SQL.  As with most clusters, the real work happens long before the individual nodes get installed and configured.  At least it does if you want the cluster to be a true high-availability platform.

    Read the article

  • Evaluating Solutions to Manage Product Compliance? Don't Wait Much Longer

    - by Kerrie Foy
    Depending on severity, product compliance issues can cause all sorts of problems from run-away budgets to business closures. But effective policies and safeguards can create a strong foundation for innovation, productivity, market penetration and competitive advantage. If you’ve been putting off a systematic approach to product compliance, it is time to reconsider that decision, or indecision. Why now?  No matter what industry, companies face a litany of worldwide and regional regulations that require proof of product compliance and environmental friendliness for market access.  For example, Restriction of Hazardous Substances (RoHS) is a regulation that restricts the use of six dangerous materials used in the manufacture of electronic and electrical equipment.  ROHS was originally adopted by the European Union in 2003 for implementation in 2006, and it has evolved over time through various regional versions for North America, China, Japan, Korea, Norway and Turkey.  In addition, the RoHS directive allowed for material exemptions used in Medical Devices, but that exemption ends in 2014.   Additional regulations worth watching are the Battery Directive, Waste Electrical and Electronic Equipment (WEEE), and Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) directives.  Additional evolving regulations are coming from governing bodies like the Food and Drug Administration (FDA) and the International Organization for Standardization (ISO). Corporate sustainability initiatives are also gaining urgency and influencing product design. In a survey of 405 corporations in the Global 500 by Carbon Disclosure Project, co-written by PwC (CDP Global 500 Climate Change Report 2012 entitled Business Resilience in an Uncertain, Resource-Constrained World), 48% of the respondents indicated they saw potential to create new products and business services as a response to climate change. Just 21% reported a dedicated budget for the research. However, the report goes on to explain that those few companies are winning over new customers and driving additional profits by exploiting their abilities to adapt to environmental needs. The article cites Dell as an example – Dell has invested in research to develop new products designed to reduce its customers’ emissions by more than 10 million metric tons of CO2e per year. This reduction in emissions should save Dell’s customers over $1billion per year as a result! Over time we expect to see many additional companies prove that eco-design provides marketplace benefits through differentiation and direct customer value. How do you meet compliance requirements and also successfully invest in eco-friendly designs? No doubt companies struggle to answer this question. After all, the journey to get there may involve transforming business models, go-to-market strategies, supply networks, quality assurance policies and compliance processes per the rapidly evolving global and regional directives. There may be limited executive focus on the initiative, inability to quantify noncompliance, or not enough resources to justify investment. To make things even more difficult to address, compliance responsibility can be a passionate topic within an organization, making the prospect of change on an enterprise scale problematic and time-consuming. Without a single source of truth for product data and without proper processes in place, ensuring product compliance burgeons into a crushing task that is cost-prohibitive and overwhelming to an organization. With all the overhead, certain markets or demographics become simply inaccessible. Therefore, the risk to consumer goodwill and satisfaction, revenue, business continuity, and market potential is too great not to solve the compliance challenge. Companies are beginning to adapt and even thrive in today’s highly regulated and transparent environment by implementing systematic approaches to product compliance that are more than functional bandages but revenue-generating engines. Consider partnering with Oracle to help you address your compliance needs. Many of the world’s most innovative leaders and pioneers are leveraging Oracle’s Agile Product Lifecycle Management (PLM) portfolio of enterprise applications to manage the product value chain, centralize product data, automate processes, and launch more eco-friendly products to market faster.   Particularly, the Agile Product Governance & Compliance (PG&C) solution provides out-of-the-box functionality to integrate actionable regulatory information into the enterprise product record from the ideation to the disposal/recycling phase. Agile PG&C makes it possible to efficiently manage compliance per corporate green initiatives as well as regional and global directives. Options are critical, but so is ease-of-use. Anyone who’s grappled with compliance policy knows legal interpretation plays a major role in determining how an organization responds to regulation. Agile PG&C gives you the freedom to configure product compliance per your needs, while maintaining rigorous control over the product record in an easy-to-use interface that facilitates adoption efforts. It allows you to assign regulations as specifications for a part or BOM roll-up. Each specification has a threshold value that alerts you to a non-compliance issue if the threshold value is exceeded. Set however many regulations as specifications you need to make sure a product can be sold in your target countries. Another option is to implement like one of our leading consumer electronics customers and define your own “catch-all” specification to ensure compliance in all markets. You can give your suppliers secure access to enter their component data or integrate a third party’s data. With Agile PG&C you are able to design compliance earlier into your products to reduce cost and improve quality downstream when stakes are higher. Agile PG&C is a comprehensive solution that makes product compliance more reliable and efficient. Throughout product lifecycles, use the solution to support full material disclosures, efficiently manage declarations with your suppliers, feed compliance data into a corrective action if a product must be changed, and swiftly satisfy audits by showing all due diligence tracked in one solution. Given the compounding regulation and consumer focus on urgent environmental issues, now is the time to act. Implementing an enterprise, systematic approach to product compliance is a competitive investment. From the start, Agile Product Governance & Compliance enables companies to confidently design for compliance and sustainability, reduce the cost of compliance, minimize the risk of business interruption, deliver responsible products, and inspire new innovation.  Don’t wait any longer! To find out more about Agile Product Governance & Compliance download the data sheet, contact your sales representative, or call Oracle at 1-800-633-0738. Many thanks to Shane Goodwin, Senior Manager, Oracle Agile PLM Product Management, for contributions to this article. 

    Read the article

< Previous Page | 488 489 490 491 492 493 494 495 496 497 498 499  | Next Page >