Search Results

Search found 65101 results on 2605 pages for 'big data'.

Page 23/2605 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30  | Next Page >

  • Oracle Enterprise Data Quality Adds Global Address Verification Capabilities for Greater Accuracy and Broader Location Coverage

    - by Mala Narasimharajan
    Data quality – has many flavors to it.  Product, Customer – you name the data domain and there’s data quality associated with it.  Address verification and data quality are a little different.  in that there is a tremendous amount of variation as well as nuance attached to it.  Specifically, what makes address verification challenging is that more often than not, addresses are incomplete, riddled with misspellings, incorrect postal codes are assigned to locations or non-address items are present.  Almost all data has locations, and accurate locations power a wealth of business processes: Customer Relationship Management, data quality, delivery of materials, goods or services, fraud detection, insurance risk assessment, data analytics, store and territory planning, and much more. Oracle Address Verification Server provides location-based services as well as deeper parsing and analysis capabilities for Oracle Enterprise Data Quality.  Specifically, Pre-integrated with the EDQ platform, Oracle Address Verification Server provides robust parsing, validation, as well as specialized location information for over 240 countries – all populated countries on Earth.  Oracle Enterprise Data Quality (EDQ) is a data quality platform, dedicated to address the distinct challenges of customer and product data quality, and performs advanced data profiling to identify and measure poor quality data and identify rule requirements, as well as semantic and pattern-based recognition to accurately parse and standardize data that is poorly structured.   EDQ is integrated with Oracle Master Data Management, including Oracle Customer Hub and Oracle Product Hub, as well as Oracle Data Integrator Enterprise Edition and Oracle CRM.  Address Verification Server provides key address verification services for Oracle CRM and Oracle Customer Hub.  In addition, Address Verification Server provides greater accuracy when handling address data due to its expanded sources and extensible knowledge repository, solid parsing across locales and countries as well as  adept handling of extraneous data in address fields.  For more information on Oracle Address Verification Server visit:  http://bit.ly/GMUE4H and http://bit.ly/GWf7U6

    Read the article

  • Globacom and mCentric Deploy BDA and NoSQL Database to analyze network traffic 40x faster

    - by Jean-Pierre Dijcks
    In a fast evolving market, speed is of the essence. mCentric and Globacom leveraged Big Data Appliance, Oracle NoSQL Database to save over 35,000 Call-Processing minutes daily and analyze network traffic 40x faster.  Here are some highlights from the profile: Why Oracle “Oracle Big Data Appliance works well for very large amounts of structured and unstructured data. It is the most agile events-storage system for our collect-it-now and analyze-it-later set of business requirements. Moreover, choosing a prebuilt solution drastically reduced implementation time. We got the big data benefits without needing to assemble and tune a custom-built system, and without the hidden costs required to maintain a large number of servers in our data center. A single support license covers both the hardware and the integrated software, and we have one central point of contact for support,” said Sanjib Roy, CTO, Globacom. Implementation Process It took only five days for Oracle partner mCentric to deploy Oracle Big Data Appliance, perform the software install and configuration, certification, and resiliency testing. The entire process—from site planning to phase-I, go-live—was executed in just over ten weeks, well ahead of the four months allocated to complete the project. Oracle partner mCentric leveraged Oracle Advanced Customer Support Services’ implementation methodology to ensure configurations are tailored for peak performance, all patches are applied, and software and communications are consistently tested using proven methodologies and best practices. Read the entire profile here.

    Read the article

  • Making Spring Data JPA work with DataNucleus (GAE) (Spring Boot)

    - by xybrek
    There are several hints that Spring Data works with Google App Engine like: http://tommysiu.blogspot.com/2014/01/spring-data-on-gae-part-1.html http://blog.eisele.net/2009/07/spring-300m3-on-google-appengine-with.html Much of the examples are not "Spring Boot" so I've been trying to retrofit things with it. However, I've been stuck with this error for days and days: [INFO] Caused by: java.lang.NullPointerException [INFO] at org.datanucleus.api.jpa.metamodel.SingularAttributeImpl.isVersion(SingularAttributeImpl.java:79) [INFO] at org.springframework.data.jpa.repository.support.JpaMetamodelEntityInformation.findVersionAttribute(JpaMetamodelEntityInformation.java:102) [INFO] at org.springframework.data.jpa.repository.support.JpaMetamodelEntityInformation.<init>(JpaMetamodelEntityInformation.java:79) [INFO] at org.springframework.data.jpa.repository.support.JpaEntityInformationSupport.getMetadata(JpaEntityInformationSupport.java:65) [INFO] at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getEntityInformation(JpaRepositoryFactory.java:149) [INFO] at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getTargetRepository(JpaRepositoryFactory.java:88) [INFO] at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getTargetRepository(JpaRepositoryFactory.java:68) [INFO] at org.springframework.data.repository.core.support.RepositoryFactorySupport.getRepository(RepositoryFactorySupport.java:158) [INFO] at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.initAndReturn(RepositoryFactoryBeanSupport.java:224) [INFO] at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.afterPropertiesSet(RepositoryFactoryBeanSupport.java:210) [INFO] at org.springframework.data.jpa.repository.support.JpaRepositoryFactoryBean.afterPropertiesSet(JpaRepositoryFactoryBean.java:92) [INFO] at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory$6.run(AbstractAutowireCapableBeanFactory.java:1602) [INFO] at java.security.AccessController.doPrivileged(Native Method) [INFO] at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1599) [INFO] at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1549) [INFO] ... 40 more Where, I'm trying to use Spring Data JPA with DataNucleus/AppEngine: @Configuration @ComponentScan @EnableJpaRepositories @EnableTransactionManagement class JpaApplicationConfig { private static final Logger logger = Logger .getLogger(JpaApplicationConfig.class.getName()); @Bean public EntityManagerFactory entityManagerFactory() { logger.info("Loading Entity Manager..."); return Persistence .createEntityManagerFactory("transactions-optional"); } @Bean public PlatformTransactionManager transactionManager() { logger.info("Loading Transaction Manager..."); final JpaTransactionManager txManager = new JpaTransactionManager(); txManager.setEntityManagerFactory(entityManagerFactory()); return txManager; } } I've tested Persistence.createEntityManagerFactory("transactions-optional"); to see if the app can persist using this EMF, well, it does, so I am sure that this EMF works fine. The problem is the "wiring" up with the Spring Data JPA, can anybody help?

    Read the article

  • Parse and read data frame in C?

    - by user253656
    I am writing a program that reads the data from the serial port on Linux. The data are sent by another device with the following frame format: |start | Command | Data | CRC | End | |0x02 | 0x41 | (0-127 octets) | | 0x03| ---------------------------------------------------- The Data field contains 127 octets as shown and octet 1,2 contains one type of data; octet 3,4 contains another data. I need to get these data I know how to write and read data to and from a serial port in Linux, but it is just to write and read a simple string (like "ABD") My issue is that I do not know how to parse the data frame formatted as above so that I can: get the data in octet 1,2 in the Data field get the data in octet 3,4 in the Data field get the value in CRC field to check the consistency of the data Here the sample snip code that read and write a simple string from and to a serial port in Linux: int writeport(int fd, char *chars) { int len = strlen(chars); chars[len] = 0x0d; // stick a <CR> after the command chars[len+1] = 0x00; // terminate the string properly int n = write(fd, chars, strlen(chars)); if (n < 0) { fputs("write failed!\n", stderr); return 0; } return 1; } int readport(int fd, char *result) { int iIn = read(fd, result, 254); result[iIn-1] = 0x00; if (iIn < 0) { if (errno == EAGAIN) { printf("SERIAL EAGAIN ERROR\n"); return 0; } else { printf("SERIAL read error %d %s\n", errno, strerror(errno)); return 0; } } return 1; } Does anyone please have some ideas? Thanks all.

    Read the article

  • Pre-filtering and shaping OData feeds using WCF Data Services and the Entity Framework - Part 1

    - by rajbk
    The Open Data Protocol, referred to as OData, is a new data-sharing standard that breaks down silos and fosters an interoperative ecosystem for data consumers (clients) and producers (services) that is far more powerful than currently possible. It enables more applications to make sense of a broader set of data, and helps every data service and client add value to the whole ecosystem. WCF Data Services (previously known as ADO.NET Data Services), then, was the first Microsoft technology to support the Open Data Protocol in Visual Studio 2008 SP1. It provides developers with client libraries for .NET, Silverlight, AJAX, PHP and Java. Microsoft now also supports OData in SQL Server 2008 R2, Windows Azure Storage, Excel 2010 (through PowerPivot), and SharePoint 2010. Many other other applications in the works. * This post walks you through how to create an OData feed, define a shape for the data and pre-filter the data using Visual Studio 2010, WCF Data Services and the Entity Framework. A sample project is attached at the bottom of Part 2 of this post. Pre-filtering and shaping OData feeds using WCF Data Services and the Entity Framework - Part 2 Create the Web Application File –› New –› Project, Select “ASP.NET Empty Web Application” Add the Entity Data Model Right click on the Web Application in the Solution Explorer and select “Add New Item..” Select “ADO.NET Entity Data Model” under "Data”. Name the Model “Northwind” and click “Add”.   In the “Choose Model Contents”, select “Generate Model From Database” and click “Next”   Define a connection to your database containing the Northwind database in the next screen. We are going to expose the Products table through our OData feed. Select “Products” in the “Choose your Database Object” screen.   Click “Finish”. We are done creating our Entity Data Model. Save the Northwind.edmx file created. Add the WCF Data Service Right click on the Web Application in the Solution Explorer and select “Add New Item..” Select “WCF Data Service” from the list and call the service “DataService” (creative, huh?). Click “Add”.   Enable Access to the Data Service Open the DataService.svc.cs class. The class is well commented and instructs us on the next steps. public class DataService : DataService< /* TODO: put your data source class name here */ > { // This method is called only once to initialize service-wide policies. public static void InitializeService(DataServiceConfiguration config) { // TODO: set rules to indicate which entity sets and service operations are visible, updatable, etc. // Examples: // config.SetEntitySetAccessRule("MyEntityset", EntitySetRights.AllRead); // config.SetServiceOperationAccessRule("MyServiceOperation", ServiceOperationRights.All); config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2; } } Replace the comment that starts with “/* TODO:” with “NorthwindEntities” (the entity container name of the Model we created earlier).  WCF Data Services is initially locked down by default, FTW! No data is exposed without you explicitly setting it. You have explicitly specify which Entity sets you wish to expose and what rights are allowed by using the SetEntitySetAccessRule. The SetServiceOperationAccessRule on the other hand sets rules for a specified operation. Let us define an access rule to expose the Products Entity we created earlier. We use the EnititySetRights.AllRead since we want to give read only access. Our modified code is shown below. public class DataService : DataService<NorthwindEntities> { public static void InitializeService(DataServiceConfiguration config) { config.SetEntitySetAccessRule("Products", EntitySetRights.AllRead); config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2; } } We are done setting up our ODataFeed! Compile your project. Right click on DataService.svc and select “View in Browser” to see the OData feed. To view the feed in IE, you must make sure that "Feed Reading View" is turned off. You set this under Tools -› Internet Options -› Content tab.   If you navigate to “Products”, you should see the Products feed. Note also that URIs are case sensitive. ie. Products work but products doesn’t.   Filtering our data OData has a set of system query operations you can use to perform common operations against data exposed by the model. For example, to see only Products in CategoryID 2, we can use the following request: /DataService.svc/Products?$filter=CategoryID eq 2 At the time of this writing, supported operations are $orderby, $top, $skip, $filter, $expand, $format†, $select, $inlinecount. Pre-filtering our data using Query Interceptors The Product feed currently returns all Products. We want to change that so that it contains only Products that have not been discontinued. WCF introduces the concept of interceptors which allows us to inject custom validation/policy logic into the request/response pipeline of a WCF data service. We will use a QueryInterceptor to pre-filter the data so that it returns only Products that are not discontinued. To create a QueryInterceptor, write a method that returns an Expression<Func<T, bool>> and mark it with the QueryInterceptor attribute as shown below. [QueryInterceptor("Products")] public Expression<Func<Product, bool>> OnReadProducts() { return o => o.Discontinued == false; } Viewing the feed after compilation will only show products that have not been discontinued. We also confirm this by looking at the WHERE clause in the SQL generated by the entity framework. SELECT [Extent1].[ProductID] AS [ProductID], ... ... [Extent1].[Discontinued] AS [Discontinued] FROM [dbo].[Products] AS [Extent1] WHERE 0 = [Extent1].[Discontinued] Other examples of Query/Change interceptors can be seen here including an example to filter data based on the identity of the authenticated user. We are done pre-filtering our data. In the next part of this post, we will see how to shape our data. Pre-filtering and shaping OData feeds using WCF Data Services and the Entity Framework - Part 2 Foot Notes * http://msdn.microsoft.com/en-us/data/aa937697.aspx † $format did not work for me. The way to get a Json response is to include the following in the  request header “Accept: application/json, text/javascript, */*” when making the request. This is easily done with most JavaScript libraries.

    Read the article

  • JavaScript Data Binding Frameworks

    - by dwahlin
    Data binding is where it’s at now days when it comes to building client-centric Web applications. Developers experienced with desktop frameworks like WPF or web frameworks like ASP.NET, Silverlight, or others are used to being able to take model objects containing data and bind them to UI controls quickly and easily. When moving to client-side Web development the data binding story hasn’t been great since neither HTML nor JavaScript natively support data binding. This means that you have to write code to place data in a control and write code to extract it. Although it’s certainly feasible to do it from scratch (many of us have done it this way for years), it’s definitely tedious and not exactly the best solution when it comes to maintenance and re-use. Over the last few years several different script libraries have been released to simply the process of binding data to HTML controls. In fact, the subject of data binding is becoming so popular that it seems like a new script library is being released nearly every week. Many of the libraries provide MVC/MVVM pattern support in client-side JavaScript apps and some even integrate directly with server frameworks like Node.js. Here’s a quick list of a few of the available libraries that support data binding (if you like any others please add a comment and I’ll try to keep the list updated): AngularJS MVC framework for data binding (although closely follows the MVVM pattern). Backbone.js MVC framework with support for models, key/value binding, custom events, and more. Derby Provides a real-time environment that runs in the browser an in Node.js. The library supports data binding and templates. Ember Provides support for templates that automatically update as data changes. JsViews Data binding framework that provides “interactive data-driven views built on top of JsRender templates”. jQXB Expression Binder Lightweight jQuery plugin that supports bi-directional data binding support. KnockoutJS MVVM framework with robust support for data binding. For an excellent look at using KnockoutJS check out John Papa’s course on Pluralsight. Meteor End to end framework that uses Node.js on the server and provides support for data binding on  the client. Simpli5 JavaScript framework that provides support for two-way data binding. WinRT with HTML5/JavaScript If you’re building Windows 8 applications using HTML5 and JavaScript there’s built-in support for data binding in the WinJS library.   I won’t have time to write about each of these frameworks, but in the next post I’m going to talk about my (current) favorite when it comes to client-side JavaScript data binding libraries which is AngularJS. AngularJS provides an extremely clean way – in my opinion - to extend HTML syntax to support data binding while keeping model objects (the objects that hold the data) free from custom framework method calls or other weirdness. While I’m writing up the next post, feel free to visit the AngularJS developer guide if you’d like additional details about the API and want to get started using it.

    Read the article

  • Protect Data and Save Money? Learn How Best-in-Class Organizations do Both

    - by roxana.bradescu
    Databases contain nearly two-thirds of the sensitive information that must be protected as part of any organization's overall approach to security, risk management, and compliance. Solutions for protecting data housed in databases vary from encrypting data at the application level to defense-in-depth protection of the database itself. So is there a difference? Absolutely! According to new research from the Aberdeen Group, Best-in-Class organizations experience fewer data breaches and audit deficiencies - at lower cost -- by deploying database security solutions. And the results are dramatic: Aberdeen found that organizations encrypting data within their databases achieved 30% fewer data breaches and 15% greater audit efficiency with 34% less total cost when compared to organizations encrypting data within applications. Join us for a live webcast with Derek Brink, Vice President and Research Fellow at the Aberdeen Group, next week to learn how your organization can become Best-in-Class.

    Read the article

  • Bad Data is Really the Monster

    - by Dain C. Hansen
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Bad Data is really the monster – is an article written by Bikram Sinha who I borrowed the title and the inspiration for this blog. Sinha writes: “Bad or missing data makes application systems fail when they process order-level data. One of the key items in the supply-chain industry is the product (aka SKU). Therefore, it becomes the most important data element to tie up multiple merchandising processes including purchase order allocation, stock movement, shipping notifications, and inventory details… Bad data can cause huge operational failures and cost millions of dollars in terms of time, resources, and money to clean up and validate data across multiple participating systems. Yes bad data really is the monster, so what do we do about it? Close our eyes and hope it stays in the closet? We’ve tacked this problem for some years now at Oracle, and with our latest introduction of Oracle Enterprise Data Quality along with our integrated Oracle Master Data Management products provides a complete, best-in-class answer to the bad data monster. What’s unique about it? Oracle Enterprise Data Quality also combines powerful data profiling, cleansing, matching, and monitoring capabilities while offering unparalleled ease of use. What makes it unique is that it has dedicated capabilities to address the distinct challenges of both customer and product data quality – [different monsters have different needs of course!]. And the ability to profile data is just as important to identify and measure poor quality data and identify new rules and requirements. Included are semantic and pattern-based recognition to accurately parse and standardize data that is poorly structured. Finally all of the data quality components are integrated with Oracle Master Data Management, including Oracle Customer Hub and Oracle Product Hub, as well as Oracle Data Integrator Enterprise Edition and Oracle CRM. Want to learn more? On Tuesday Nov 15th, I invite you to listen to our webcast on Reduce ERP consolidation risks with Oracle Master Data Management I’ll be joined by our partner iGate Patni and be talking about one specific way to deal with the bad data monster specifically around ERP consolidation. Look forward to seeing you there!

    Read the article

  • Where can I locate business data to use in my application?

    - by Aaron McIver
    This question talks about any and all free public raw data which appeared to have valuable pieces but nothing that really provides what I am looking for. Instead of using a socially defined listing of businesses (foursquare), I would like a business listing data set of registered businesses and associated addresses that could then be searchable based on location (coordinates). The critical need is that the data set should be filterable based on varying criteria (give me all restaurants, coffee shops, etc...). If the data is free that is great but anywhere that sells this type of data would also suffice. Infochimps looked like a possibility but perhaps something a bit more extensive exists. Where can I find a free or for fee data set of registered business that is filterable based on type of business and location?

    Read the article

  • Zend GData - seems really big

    - by musoNic80
    I've downloaded the Zend GData package to use the Google Calendar API. When I look through the contents of the package it seems to contain loads and loads of stuff. Do I really need all of it just for using Google Calendar and no other Google APIs? If not, what can i safely get rid of?

    Read the article

  • Sybase PowerDesigner Change Many (Find/Replace/Convert) Data Item's Data Types

    - by Andy
    Hello, I have a relatively large Conceptual Data Model in PowerDesigner. After generating a Physical Data Model and seeing the DBMS data types, I need to update all of data types(NUMBER/TEXT) for each data item. I'd like to either do a find/replace within the Conceptual Data Model or somehow map to different data types when creating the Physical Data Model. Ex. Change the auto conversion of Text - Clob, to Text - NVARCHAR(20). Thanks!

    Read the article

  • are there any useful datasets available on the web for data mining?

    - by niko
    Hi, Does anyone know any good resource where example (real) data can be downloaded for experimenting statistics and machine learning techniques such as decision trees etc? Currently I am studying machine learning techniques and it would be very helpful to have real data for evaluating the accuracy of various tools. If anyone knows any good resource (perhaps csv, xls files or any other format) I would be very thankful for a suggestion.

    Read the article

  • Splitting big request in multiple small ajax requests

    - by Ionut
    I am unsure regarding the scalability of the following model. I have no experience at all with large systems, big number of requests and so on but I'm trying to build some features considering scalability first. In my scenario there is a user page which contains data for: User's details (name, location, workplace ...) User's activity (blog posts, comments...) User statistics (rating, number of friends...) In order to show all this on the same page, for a request there will be at least 3 different database queries on the back-end. In some cases, I imagine that those queries will be running quite a wile, therefore the user experience may suffer while waiting between requests. This is why I decided to run only step 1 (User's details) as a normal request. After the response is received, two ajax requests are sent for steps 2 and 3. When those responses are received, I only place the data in the destined wrappers. For me at least this makes more sense. However there are 3 requests instead of one for every user page view. Will this affect the system on the long term? I'm assuming that this kind of approach requires more resources but is this trade of UX for resources a good dial or should I stick to one plain big request?

    Read the article

  • When you’re on a high, start something big

    - by BuckWoody
    Most days are pretty average – we have some highs, some lows, and just regular old work to do. But some days the sun is shining, your co-workers are especially nice, and everything just falls into place. You really *enjoy* what you do. Don’t let that moment pass. All of us have “big” projects that we need to tackle. Things that are going to take a long time, and a lot of money. Those kinds of data projects take a LOT of planning, and many times we put that off just to get to the day’s work. I’ve found that the “high” moments are the perfect time to take on these big projects. I’m more focused, and more importantly, more positive. And as the quote goes, “whether you think you can or you think you can’t, you’re probably right.” You’ll find a way to make it happen if you’re in a positive mood. Now – having those “great days” is actually something you can influence, but I’ll save that topic for a future post. I have a project to work on. :) Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Know your Data Lineage

    - by Simon Elliston Ball
    An academic paper without the footnotes isn’t an academic paper. Journalists wouldn’t base a news article on facts that they can’t verify. So why would anyone publish reports without being able to say where the data has come from and be confident of its quality, in other words, without knowing its lineage. (sometimes referred to as ‘provenance’ or ‘pedigree’) The number and variety of data sources, both traditional and new, increases inexorably. Data comes clean or dirty, processed or raw, unimpeachable or entirely fabricated. On its journey to our report, from its source, the data can travel through a network of interconnected pipes, passing through numerous distinct systems, each managed by different people. At each point along the pipeline, it can be changed, filtered, aggregated and combined. When the data finally emerges, how can we be sure that it is right? How can we be certain that no part of the data collection was based on incorrect assumptions, that key data points haven’t been left out, or that the sources are good? Even when we’re using data science to give us an approximate or probable answer, we cannot have any confidence in the results without confidence in the data from which it came. You need to know what has been done to your data, where it came from, and who is responsible for each stage of the analysis. This information represents your data lineage; it is your stack-trace. If you’re an analyst, suspicious of a number, it tells you why the number is there and how it got there. If you’re a developer, working on a pipeline, it provides the context you need to track down the bug. If you’re a manager, or an auditor, it lets you know the right things are being done. Lineage tracking is part of good data governance. Most audit and lineage systems require you to buy into their whole structure. If you are using Hadoop for your data storage and processing, then tools like Falcon allow you to track lineage, as long as you are using Falcon to write and run the pipeline. It can mean learning a new way of running your jobs (or using some sort of proxy), and even a distinct way of writing your queries. Other Hadoop tools provide a lot of operational and audit information, spread throughout the many logs produced by Hive, Sqoop, MapReduce and all the various moving parts that make up the eco-system. To get a full picture of what’s going on in your Hadoop system you need to capture both Falcon lineage and the data-exhaust of other tools that Falcon can’t orchestrate. However, the problem is bigger even that that. Often, Hadoop is just one piece in a larger processing workflow. The next step of the challenge is how you bind together the lineage metadata describing what happened before and after Hadoop, where ‘after’ could be  a data analysis environment like R, an application, or even directly into an end-user tool such as Tableau or Excel. One possibility is to push as much as you can of your key analytics into Hadoop, but would you give up the power, and familiarity of your existing tools in return for a reliable way of tracking lineage? Lineage and auditing should work consistently, automatically and quietly, allowing users to access their data with any tool they require to use. The real solution, therefore, is to create a consistent method by which to bring lineage data from these data various disparate sources into the data analysis platform that you use, rather than being forced to use the tool that manages the pipeline for the lineage and a different tool for the data analysis. The key is to keep your logs, keep your audit data, from every source, bring them together and use the data analysis tools to trace the paths from raw data to the answer that data analysis provides.

    Read the article

  • Google I/O 2012 - Crunching Big Data with BigQuery

    Google I/O 2012 - Crunching Big Data with BigQuery Jordan Tigani, Ryan Boyd Google BigQuery is a data analysis tool born from Google internal technologies. It enables developers to analyze terabyte data sets in seconds using a RESTful API. This session will dive into best practices for getting fast answers to business questions. We'll provide insight into how we process queries under the hood and how to construct SQL queries for complex analysis. For all I/O 2012 sessions, go to developers.google.com From: GoogleDevelopers Views: 1 0 ratings Time: 01:03:04 More in Science & Technology

    Read the article

  • NSURLConnection receives data even if no data was thrown back

    - by Anna Fortuna
    Let me explain my situation. Currently, I am experimenting long-polling using NSURLConnection. I found this and I decided to try it. What I do is send a request to the server with a timeout interval of 300 secs. (or 5 mins.) Here is a code snippet: NSURL *url = [NSURL URLWithString:urlString]; NSURLRequest *request = [NSURLRequest requestWithURL:url cachePolicy:NSURLCacheStorageAllowedInMemoryOnly timeoutInterval:300]; NSData *data = [NSURLConnection sendSynchronousRequest:request returningResponse:&resp error:&err]; Now I want to test if the connection will "hold" the request if no data was thrown back from the server, so what I did was this: if (data != nil) [self performSelectorOnMainThread:@selector(dataReceived:) withObject:data waitUntilDone:YES]; And the function dataReceived: looks like this: - (void)dataReceived:(NSData *)data { NSLog(@"DATA RECEIVED!"); NSString *string = [NSString stringWithUTF8String:[data bytes]]; NSLog(@"THE DATA: %@", string); } Server-side, I created a function that will return a data once it fits the arguments and returns none if nothing fits. Here is a snippet of the PHP function: function retrieveMessages($vardata) { if (!empty($vardata)) { $result = check_data($vardata) //check_data is the function which returns 1 if $vardata //fits the arguments, and 0 if it fails to fit if ($result == 1) { $jsonArray = array('Data' => $vardata); echo json_encode($jsonArray); } } } As you can see, the function will only return data if the $result is equal to 1. However, even if the function returns nothing, NSURLConnection will still perform the function dataReceived: meaning the NSURLConnection still receives data, albeit an empty one. So can anyone help me here? How will I perform long-polling using NSURLConnection? Basically, I want to maintain the connection as long as no data is returned. So how will I do it? NOTE: I am new to PHP, so if my code is wrong, please point it out so I can correct it.

    Read the article

  • How to maintain an ordered table with Core Data (or SQL) with insertions/deletions?

    - by Jean-Denis Muys
    This question is in the context of Core Data, but if I am not mistaken, it applies equally well to a more general SQL case. I want to maintain an ordered table using Core Data, with the possibility for the user to: reorder rows insert new lines anywhere delete any existing line What's the best data model to do that? I can see two ways: 1) Model it as an array: I add an int position property to my entity 2) Model it as a linked list: I add two one-to-one relations, next and previous from my entity to itself 1) makes it easy to sort, but painful to insert or delete as you then have to update the position of all objects that come after 2) makes it easy to insert or delete, but very difficult to sort. In fact, I don't think I know how to express a Sort Descriptor (SQL ORDER BY clause) for that case. Now I can imagine a variation on 1): 3) add an int ordering property to the entity, but instead of having it count one-by-one, have it count 100 by 100 (for example). Then inserting is as simple as finding any number between the ordering of the previous and next existing objects. The expensive renumbering only has to occur when the 100 holes have been filled. Making that property a float rather than an int makes it even better: it's almost always possible to find a new float midway between two floats. Am I on the right track with solution 3), or is there something smarter?

    Read the article

  • Creating a unique id for a Core Data program on the iPhone

    - by Big Twisty
    I am fairly new at programming in Objective-C, having been a windows programmer since I was a kid. I am falling in LOVE with it. I am, however, having a bit of trouble figuring out this Core Data stuff. How do I create a new entry with a unique ID? In SQL I would just declare one field as an autoincrement field. I'm not seeing anything like that here, but I could just be missing something. I just want an auto incrementing NSInteger field, so when I manually add items to the database, I will have some form of reference to them.

    Read the article

  • How can I scrape specific data from a website

    - by Stoney
    I'm trying to scrape data from a website for research. The urls are nicely organized in an example.com/x format, with x as an ascending number and all of the pages are structured in the same way. I just need to grab certain headings and a few numbers which are always in the same locations. I'll then need to get this data into structured form for analysis in Excel. I have used wget before to download pages, but I can't figure out how to grab specific lines of text. Excel has a feature to grab data from the web (Data-From Web) but from what I can see it only allows me to download tables. Unfortunately, the data I need is not in tables.

    Read the article

  • How should I architect my Model and Data Access layer objects in my website?

    - by Robin Winslow
    I've been tasked with designing Data layer for a website at work, and I am very interested in architecture of code for the best flexibility, maintainability and readability. I am generally acutely aware of the value in completely separating out my actual Models from the Data Access layer, so that the Models are completely naive when it comes to Data Access. And in this case it's particularly useful to do this as the Models may be built from the Database or may be built from a Soap web service. So it seems to me to make sense to have Factories in my data access layer which create Model objects. So here's what I have so far (in my made-up pseudocode): class DataAccess.ProductsFromXml extends DataAccess.ProductFactory {} class DataAccess.ProductsFromDatabase extends DataAccess.ProductFactory {} These then get used in the controller in a fashion similar to the following: var xmlProductCreator = DataAccess.ProductsFromXml(xmlDataProvider); var databaseProductCreator = DataAccess.ProductsFromXml(xmlDataProvider); // Returns array of Product model objects var XmlProducts = databaseProductCreator.Products(); // Returns array of Product model objects var DbProducts = xmlProductCreator.Products(); So my question is, is this a good structure for my Data Access layer? Is it a good idea to use a Factory for building my Model objects from the data? Do you think I've misunderstood something? And are there any general patterns I should read up on for how to write my data access objects to create my Model objects?

    Read the article

  • #OOW 2012: Big Data and The Social Revolution

    - by Eric Bezille
    As what was saying Cognizant CSO Malcolm Frank about the "Futur of Work", and how the Business should prepare in the face of the new generation  not only of devices and "internet of things" but also due to their users ("The Millennials"), moving from "consumers" to "prosumers" :  we are at a turning point today which is bringing us to the next IT Architecture Wave. So this is no more just about putting Big Data, Social Networks and Customer Experience (CxM) on top of old existing processes, it is about embracing the next curve, by identifying what processes need to be improve, but also and more importantly what processes are obsolete and need to be get ride of, and new processes put in place. It is about managing both the hierarchical and structured Enterprise and its social connections and influencers inside and outside of the Enterprise. And this does apply everywhere, up to the Utilities and Smart Grids, where it is no more just about delivering (faster) the same old 300 reports that have grown over time with those new technologies but to understand what need to be looked at, in real-time, down to an hand full relevant reports with the KPI relevant to the business. It is about how IT can anticipate the next wave, and is able to answers Business questions, and give those capabilities in real-time right at the hand of the decision makers... This is the turning curve, where IT is really moving from the past decade "Cost Center" to "Value for the Business", as Corporate Stakeholders will be able to touch the value directly at the tip of their fingers. It is all about making Data Driven Strategic decisions, encompassed and enriched by ALL the Data, and connected to customers/prosumers influencers. This brings to stakeholders the ability to make informed decisions on question like : “What would be the best Olympic Gold winner to represent my automotive brand ?”... in a few clicks and in real-time, based on social media analysis (twitter, Facebook, Google+...) and connections link to my Enterprise data. A true example demonstrated by Larry Ellison in real-time during his yesterday’s key notes, where “Hardware and Software Engineered to Work Together” is not only about extreme performances but also solutions that Business can touch thanks to well integrated Customer eXperience Management and Social Networking : bringing the capabilities to IT to move to the IT Architecture Next wave. An example, illustrated also todays in 2 others sessions, that I had the opportunity to attend. The first session bringing the “Internet of Things” in Oil&Gaz into actionable decisions thanks to Complex Event Processing capturing sensors data with the ready to run IT infrastructure leveraging Exalogic for the CEP side, Exadata for the enrich datasets and Exalytics to provide the informed decision interface up to end-user. The second session showing Real Time Decision engine in action for ACCOR hotels, with Eric Wyttynck, VP eCommerce, and his Technical Director Pascal Massenet. I have to close my post here, as I have to go to run our practical hands-on lab, cooked with Olivier Canonge, Christophe Pauliat and Simon Coter, illustrating in practice the Oracle Infrastructure Private Cloud recently announced last Sunday by Larry, and developed through many examples this morning by John Folwer. John also announced today Solaris 11.1 with a range of network innovation and virtualization at the OS level, as well as many optimizations for applications, like for Oracle RAC, with the introduction of the lock manager inside Solaris Kernel. Last but not least, he introduced Xsigo Datacenter Fabric for highly simplified networks and storage virtualization for your Cloud Infrastructure. Hoping you will get ready to jump on the next wave, we are here to help...

    Read the article

  • Calculating percentiles in Excel with "buckets" data instead of the data list itself

    - by G B
    I have a bunch of data in Excel that I need to get certain percentile information from. The problem is that instead of having the data set made up of each value, I instead have info on the number of or "bucket" data. For example, imagine that my actual data set looks like this: 1,1,2,2,2,2,3,3,4,4,4 The data set that I have is this: Value No. of occurrences 1 2 2 4 3 2 4 3 Is there an easy way for me to calculate percentile information (as well as the median) without having to explode the summary data out to full data set? (Once I did that, I know that I could just use the Percentile(A1:A5, p) function) This is important because my data set is very large. If I exploded the data out, I would have hundreds of thousands of rows and I would have to do it for a couple of hundred data sets. Help!

    Read the article

  • How do I setup a WCF Data Service with an ADO.NET Entity Entity Model in another assembly?

    - by lsb
    Hi! I have an ASP.NET 4.0 website that has an Entity Data Model hooked up to WCF Data Service. When the Service and Model are in the same assembly everything works. Unfortunately, when I move the Model to another "shared" assembly (and change the namespace) the service compiles but throws a 500 error when launched in a browser. The reason I want to have the Model in a common assembly (lets call it RiaTest.Shared) is that I want share common validation code between the client and service (by checking "Reuse types in referenced assemblies" in the Advanced tab of the Add Service Reference dialog). Anyway, I've spent a couple of hours on this to no avail so any help in the regard would be appreciated...

    Read the article

< Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30  | Next Page >