Search Results

Search found 58576 results on 2344 pages for 'consolidate data'.

Page 13/2344 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >

  • Big DataData Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It  supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Data Masking for Oracle E-Business Suite

    - by Troy Kitch
    E-Business Suite customers can now use Oracle Data Masking to obscure sensitive information in non-production environments. Many organizations are inadvertently exposed when copying sensitive or regulated production data into non-production database environments for development, quality assurance or outsourcing purposes. Due to weak security controls and unmonitored access, these non-production environments have increasingly become the target of cyber criminals. Learn more about the announcement here.

    Read the article

  • Extracting Data from a Source System to History Tables

    - by Derek D.
    This is a topic I find very little information written about, however it is very important that the method for extracting data be done in a way that does not hinder performance of the source system.  In this example, the goal is to extract data from a source system, into another database (or server) all [...]

    Read the article

  • Importing Excel data into SSIS 2008 using Data Conversion Transformation

    Despite its benefits, SQL Server Integration Services Import Export Wizard has a number of limitations, resulting in part from a new set of rules that eliminate implicit data type conversion mechanisms present in Data Transformation Services. This article discusses a method that addresses such limitations, focusing in particular on importing the content of Excel spreadsheets into SQL Server.

    Read the article

  • Ma este Oracle Data Mining újdonságok webcast!

    - by Fekete Zoltán
    2010. május 12-én szerdán 18 órakor a böngészonkkel kapcsolódva a következo roppant érdekes eloadást hallgathatjuk meg az Oracle BIWA keretében: BIWA SIG TechCast Series - May 12 - Data Mining Made Easy, az eloadó Charlie Berger, az Oracle adatbányászati vezetoje. Könnyen elvégezheto adatbányászat! Az Oracle Data Miner 11g Release 2 új "Work flow" grafikus felületének bevezetése. Csatlakozni az Oracle BIWA-hoz a ezen a linken ingyenesen lehet. Itt találhatjuk meg, hogyan lehet meghallgatni ezt a konferenciát: www.oraclebiwa.org

    Read the article

  • Google doesn't always show rich snippets when the site uses structured data [duplicate]

    - by Sam Se
    This question is an exact duplicate of: Google Structured Data [on hold] 1 answer I'm so tired of the Google structured data recipe. After some days, it loses the image and the extra information. Then I test it again, and it shows again. Some other days in the future it might go away even if it is still showing in test tool. What i can do? I tried with RDFa and schema.org microdata.

    Read the article

  • Design patterns to avoiding breaking the SRP while performing heavy data logging

    - by Kazark
    A class that performs both computations and data logging seems to have at least two responsibilities. Given a system for which the specifications require heavy data logging, what kind of design patterns or architectural patterns can be used to avoid bloating all the classes with logging calls every time they compute something? The decorator pattern be used (e.g. Interpolator decorated to LoggingInterpolator), but it seems that would result in a situation hardly more desirable in which almost every major class would need to be decorated with logging.

    Read the article

  • Big data: An evening in the life of an actual buyer

    - by Jean-Pierre Dijcks
    Here I am, and this is an actual story of one of my evenings, trying to spend money with a company and ultimately failing. I just gave up and bought a service from another vendor, not the incumbent. Here is that story and how I think big data could actually fix this (and potentially prevent some of this from happening). In the end this story should illustrate how big data can benefit me (get me what I want without causing grief) and the company I am trying to buy something from. Note: Lots of details left out, I have no intention of being the annoyed blogger moaning about a specific company. What did I want to get? We watch TV, we have internet and we do have a land line. The land line is from a different vendor then the TV and the internet. I have decided that this makes no sense and I was going to get a bundle (no need to infer who this is, I just picked the generic bundle word as this is what I want to get) of all three services as this seems to save me money. I also want to not talk to people, I just want to click on a website when I feel like it and get it all sorted. I do think that is reality. I want to just do my shopping at 9.30pm while watching silly reruns on TV. Problem 1 - Bad links So, I'm an existing customer of the company I want to buy my bundle from. I go to the website, I click on offers. Turns out they are offers for new customers. After grumbling about how good they are, I click on offers for existing customers. Bummer, it goes to offers for new customers, so I click again on the link for offers for existing customers. No cigar... it just does not work. Big data solutions: 1) Do not show an existing customer the offers for new customers unless they are the same => This is only partially doable without login, but if a customer logs in the application should always know that this is an existing customer. But in general, imagine I do this from my home going through the internet service of this vendor to their domain... an instant filter should move me into the "existing customer route". 2) Flag dead or incorrect links => I've clicked the link for "existing customer offers" at least 3 times in under 5 seconds... Identifying patterns like this is easy in Hadoop and can very quickly make a list of potentially incorrect links. No need for realtime fixing, just the fact that this link can be pro-actively fixed across my entire web domain is a good thing. Preventative maintenance! Problem 2 - Purchase cannot be completed Apart from the fact that the browsing pattern to actually get to what I want is poorly designed, my purchase never gets past a specific point. In other words, I put something into my shopping cart and when I want to move on the application either crashes (with me going to an error page) or hangs or goes into something like chat. So I try again, and again and again. I think I tried this entire path (while being logged in!!) at least 10 times over the course of 20 minutes. I also clicked on the feedback button and, frustrated as I was, tried to explain this did not work... Big Data Solutions: 1) This web site does shopping cart analysis. I got an email next day stating I have things in my shopping cart, just click here to complete my purchase. After the above experience, this just added insult to my pain... 2) What should have happened, is a Hadoop job going over all logged in customers that are on the buy flow. It should flag anyone who is trying (multiple attempts from the same user to do the same thing), analyze the shopping card, the clicks to identify what the customers wants, his feedback provided (note: always own your own website feedback, never just farm this out!!) and in a short turn around time (30 minutes to 2 hours or so) email me with a link to complete my purchase. Not with a link to my shopping cart 12 hours later, but a link to actually achieve what I wanted... Why should this company go through the big data effort? I do believe this is relatively easy to do using our Oracle Event Processing and Big Data Appliance solutions combined. It is almost so simple (to my mind) that it makes no sense that this is not in place? But, now I am ranting... Why is this interesting? It is because of $$$$. After trying really hard, I mean I did this all in the evening, and again in the morning before going to work. I kept on failing, But I really wanted this to work... so an email that said, sorry, we noticed you tried to get a bundle (the log knows what I wanted, where I failed, so easy to generate), here is the link to click and complete your purchase. And here is 2 movies on us as an apology would have kept me as a customer, and got the additional $$$$ per month for the next couple of years. It would also lead to upsell on my phone package etc. Instead, I went to a completely different company, bought service from them. Lost money for company A, negative sentiment for company A and me telling this story at the water cooler so I'm influencing more people to think negatively about company A. All in all, a loss of easy money, a ding in sentiment and image where a relatively simple solution exists and can be in place on the software I describe routinely in this blog... For those who are coming to Openworld and maybe see value in solving the above, or are thinking of how to solve this, come visit us in Moscone North - Oracle Red Lounge or in the Engineered Systems Showcase.

    Read the article

  • Classic vs universal Google analytics and loss of historical data

    - by iss42
    I'm keen to use some of the new features in Google Universal Analytics. I have an old site though that I don't want to lose the historical data for. The comparisons with historical data are interesting for example. However Google doesn't appear to allow you to change a property from the classic code to the new code. Am I missing something? I'm surprised this isn't a bigger issue for many other users.

    Read the article

  • Classic vs universal and loss of historical data

    - by iss42
    I'm keen to use some of the new features in Google Universal Analytics. I have an old site though that I don't want to lose the historical data for. The comparisons with historical data are interesting for example. However Google doesn't appear to allow you to change a property from the classic code to the new code. Am I missing something? I'm surprised this isn't a bigger issue for many other users.

    Read the article

  • Referencing Entity from external data model - Core Data

    - by Ben Reeves
    I have a external library which includes a core data model, I would like to add a new entity to this model which has a relationship with one of the entities from the library. I know I could modify the original, but is there a way to without needing to pollute the library? I tried just creating a new model with an entity named the same, but that doesn't work: * Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Can't merge models with two different entities named 'Host''

    Read the article

  • inserting new relationship data in core-data

    - by michael
    My app will allow users to create a personalised list of events from a large list of events. I have a table view which simply displays these events, tapping on one of them takes the user to the event details view, which has a button "add to my events". In this detailed view I own the original event object, retrieved via an NSFetchedResultsController and passed to the detailed view (via a table cell, the same as the core data recipes sample). I have no trouble retrieving/displaying information from this "event". I am then trying to add it to the list of MyEvents represented by a one to many (inverse) relationship: This code: NSManagedObjectContext *context = [event managedObjectContext]; MyEvents *myEvents = (MyEvents *)[NSEntityDescription insertNewObjectForEntityForName:@"MyEvents" inManagedObjectContext:context]; [myEvents addEventObject:event];//ERROR And this code (suggested below): //would this add to or overwrite the "list" i am attempting to maintain NSManagedObjectContext *context = [event managedObjectContext]; MyEvents *myEvents = (MyEvents *)[NSEntityDescription insertNewObjectForEntityForName:@"MyEvents" inManagedObjectContext:context]; NSMutableSet *myEvent = [myEvents mutableSetValueForKey:@"event"]; [myEvent addObject:event]; //ERROR Bot produce (at the line indicated by //ERROR): *** -[NSComparisonPredicate evaluateWithObject:]: message sent to deallocated instance Seems I may have missed something fundamental. I cant glean any more information through the use of debugging tools, with my knowledge of them. 1) Is this a valid way to compile and store an editable list like this? 2) Is there a better way? 3) What could possibly be the deallocated instance in error? -- I have now modified the Event entity to have a to-many relationship called "myEvents" which referrers to itself. I can add Events to this fine, and logging the object shows the correct memory addresses appearing for the relationship after a [event addMyEventObject:event];. The same failure happens right after this however. I am still at a loss to understand what is going wrong. This is the backtrace #0 0x01f753a7 in ___forwarding___ () #1 0x01f516c2 in __forwarding_prep_0___ () #2 0x01c5aa8f in -[NSFetchedResultsController(PrivateMethods) _preprocessUpdatedObjects:insertsInfo:deletesInfo:updatesInfo:sectionsWithDeletes:newSectionNames:treatAsRefreshes:] () #3 0x01c5d63b in -[NSFetchedResultsController(PrivateMethods) _managedObjectContextDidChange:] () #4 0x0002e63a in _nsnote_callback () #5 0x01f40005 in _CFXNotificationPostNotification () #6 0x0002bef0 in -[NSNotificationCenter postNotificationName:object:userInfo:] () #7 0x01bbe17d in -[NSManagedObjectContext(_NSInternalNotificationHandling) _postObjectsDidChangeNotificationWithUserInfo:] () #8 0x01c1d763 in -[NSManagedObjectContext(_NSInternalChangeProcessing) _createAndPostChangeNotification:withDeletions:withUpdates:withRefreshes:] () #9 0x01ba25ea in -[NSManagedObjectContext(_NSInternalChangeProcessing) _processRecentChanges:] () #10 0x01bdfb3a in -[NSManagedObjectContext processPendingChanges] () #11 0x01bd0957 in _performRunLoopAction () #12 0x01f4d252 in __CFRunLoopDoObservers () #13 0x01f4c65f in CFRunLoopRunSpecific () #14 0x01f4bc48 in CFRunLoopRunInMode () #15 0x0273878d in GSEventRunModal () #16 0x02738852 in GSEventRun () #17 0x002ba003 in UIApplicationMain () solution I managed to get to the bottom of this. I was fetching the event in question using a NSFetchedResultsController with a NSPredicate which I was releasing after I had the results. Retrieving values from the entities returned was no problem, but when I tried to update any of them it gave the error above. It should not have been released. oustanding part of my question What is a good way to create this sub list from a list of existing items in terms of a core data model. I don't believe its any of the ways I tried here. I need to show/edit it in another table view. Perhaps there is a better way than a boolean property on each event entity? The relationship idea above doesn't seem to work here (even though I can now create it). Cheers.

    Read the article

  • Proper Use Of HTML Data Attributes

    - by VirtuosiMedia
    I'm writing several JavaScript plugins that are run automatically when the proper HTML markup is detected on the page. For example, when a tabs class is detected, the tabs plugin is loaded dynamically and it automatically applies the tab functionality. Any customization options for the JavaScript plugin are set via HTML5 data attributes, very similar to what Twitter's Bootstrap Framework does. The appeal to the above system is that, once you have it working, you don't have worry about manually instantiating plugins, you just write your HTML markup. This is especially nice if people who don't know JavaScript well (or at all) want to make use of your plugins, which is one of my goals. This setup has been working very well, but for some plugins, I'm finding that I need a more robust set of options. My choices seem to be having an element with many data-attributes or allowing for a single data-options attribute with a JSON options object as a value. Having a lot of attributes seems clunky and repetitive, but going the JSON route makes it slightly more complicated for novices and I'd like to avoid full-blown JavaScript in the attributes if I can. I'm not entirely sure which way is best. Is there a third option that I'm not considering? Are there any recommended best practices for this particular use case?

    Read the article

  • Data classes: getters and setters or different method design

    - by Frog
    I've been trying to design an interface for a data class I'm writing. This class stores styles for characters, for example whether the character is bold, italic or underlined. But also the font-size and the font-family. So it has different types of member variables. The easiest way to implement this would be to add getters and setters for every member variable, but this just feels wrong to me. It feels way more logical (and more OOP) to call style.format(BOLD, true) instead of style.setBold(true). So to use logical methods insteads of getters/setters. But I am facing two problems while implementing these methods: I would need a big switch statement with all member variables, since you can't access a variable by the contents of a string in C++. Moreover, you can't overload by return type, which means you can't write one getter like style.getFormatting(BOLD) (I know there are some tricks to do this, but these don't allow for parameters, which I would obviously need). However, if I would implement getters and setters, there are also issues. I would have to duplicate quite some code because styles can also have a parent styles, which means the getters have to look not only at the member variables of this style, but also at the variables of the parent styles. Because I wasn't able to figure out how to do this, I decided to ask a question a couple of weeks ago. See Object Oriented Programming: getters/setters or logical names. But in that question I didn't stress it would be just a data object and that I'm not making a text rendering engine, which was the reason one of the people that answered suggested I ask another question while making that clear (because his solution, the decorator pattern, isn't suitable for my problem). So please note that I'm not creating my own text rendering engine, I just use these classes to store data. Because I still haven't been able to find a solution to this problem I'd like to ask this question again: how would you design a styles class like this? And why would you do that? Thanks on forehand!

    Read the article

  • Are the only types of data "sources" static and dynamic?

    - by blunders
    Thinking that there might be others, but not sure -- but before getting into that, let me explain what I mean by static and dynamic data sources. Static (or datastore) - Meaning that the data's state is non-changing, and if was changed, that would be a new state, and the old data would be considered stateless; meaning it no longer is known to exist, or not exist. Another way of possibly looking at a static data source might be that if read and written back without modification, the checksum for before and after should be exactly the same regardless of the duration of time between the reading and rewriting of the data. Examples: Photos, Files, Database Record, Dynamic (or datastream) - Meaning that the data's state is known to be in flux, and never expected to be the same per input. Example: Live video/audio feed, Stock Market feed, First let me say, the above is a very loose mapping of the concepts, and I'd welcome any feedback. Next, onto the core of the question, that being are these the only two types of data sources. My guess, is that yes, they are -- but that there are hybrid versions of the two. That being, streaming data that has a fixed state. For example, the data being streamed has a checksum given and each unique checksum is known to be a single instance of static data. On the flip side, static data could be chained via say a version control system; when played back, each version might be viewed as a segment of a stream; thing is, the very fact that it can be played back makes the data source static. Another type might be that the data source is being organically discovered, and it's simply unknown what the state is. Questions, feedback, requests -- just comment, thanks!!

    Read the article

  • How to make the members of my Data Access Layer object aware of their siblings

    - by Graham
    My team currently has a project with a data access object composed like so: public abstract class DataProvider { public CustomerRepository CustomerRepo { get; private set; } public InvoiceRepository InvoiceRepo { get; private set; } public InventoryRepository InventoryRepo { get; private set; } // couple more like the above } We have non-abstract classes that inherit from DataProvider, and the type of "CustomerRepo" that gets instantiated is controlled by that child class. public class FloridaDataProvider { public FloridaDataProvider() { CustomerRepo = new FloridaCustomerRepo(); // derived from base CustomerRepository InvoiceRepo = new InvoiceRespository(); InventoryRepo = new InventoryRepository(); } } Our problem is that some of the methods inside a given repo really would benefit from having access to the other repo's. Like, a method inside InventoryRepository needs to get to Customer data to do some determinations, so I need to pass in a reference to a CustomerRepository object. Whats the best way for these "sibling" repos to be aware of each other and have the ability to call each other's methods as-needed? Virtually all the other repos would benefit from having the CustomerRepo, for example, because it is where names/phones/etc are selected from, and these data elements need to be added to the various objects that are returned out of the other repos. I can't just new-up a plain "CustomerRepository" object inside a method within a different repo, because it might not be the base CustomerRepository that actually needs to run.

    Read the article

  • a flexible data structure for geometries

    - by AkiRoss
    What data structure would you use to represent meshes that are to be altered (e.g. adding or removing new faces, vertices and edges), and that have to be "studied" in different ways (e.g. finding all the triangles intersecting a certain ray, or finding all the triangles "visible" from a given point in the space)? I need to consider multiple aspects of the mesh: their geometry, their topology and spatial information. The meshes are rather big, say 500k triangles, so I am going to use the GPU when computations are heavy. I tried using arrays with vertices and arrays with indices, but I do not love adding and removing vertices from them. Also, using arrays totally ignore spatial and topological information, which I may need studying the mesh. So, I thought about using custom double-linked list data structures, but I believe doing so will require me to copy the data to array buffers before going on the GPU. I also thought about using BST, but not sure it fits. Any help is appreciated. If I have been too fuzzy and you require other information feel free to ask.

    Read the article

  • Recover Lost data/partition

    - by Kaleido
    This is what happened: I was running 12.04.1 and wanted to install 12.10. upgrade, but a fresh install. When setting up my computer I anticipated for this by dividing my 640GB HD in following partitions: 1. 60 GB for Ubuntu, boot 2. 576 GB for data, mountpoint /home 3. swap, 4GB The idea was to manually select the correct partition in the installer but I got distracted for a moment and selected the wrong option. Result: The installer started repartitioning the entire HD. When I noticed this I interrupted the installer, but upon reboot it was clear that I was too late, no OS to boot to. I booted from a Gparted Live CD to see if I could recover the data on my previous /home-partition, but it's gone. Is there any way to recover the lost data? I searched around and read alot about Testdisk, but in all the tutorials I've seen, the set-up has been much easier than what I'm facing. I've not only lost my partition table, it's been replaced. Thanks in advance for any ideas that might help! If extra info is needed, please specify and I will do my best to provide.

    Read the article

  • Dynamic Data Associate Related Table Value?

    - by davemackey
    I have create a LINQ-to-SQL project in Visual Studio 2010 using Dynamic Data. In this project I have two tables. One is called phones_extension and the other phones_ten. The list of columns in phones_extension looks like this: id, extension, prefix, did_flag, len, ten_id, restriction_class_id, sfc_id, name_display, building_id, floor, room, phone_id, department_id In phones_ten it looks like this: id, name, pbxid Now, I'd like to be able to somehow make it so that there is an association (or inheritance?) that essentially results in me being able to make a query like phones_extension.ten and it gives me the result of phones_ten.name. Right now I have to get phones_extension.ten_id and then match that against phones_ten.id - I'm trying to get the DBML to handle this translation automatically. Is this possible?

    Read the article

  • Core Data grouping data in table

    - by OscarTheGrouch
    I am using core data trying to create a simple database app, I have an entity called "Game" which has a "creator". I have basically used the iPhone table view template and replaced the names. I have the games listed by creator. Currently the tableview looks like this... Chris Ryder Chris Ryder Chris Ryder Chris Ryder Dan Grimaldi Dan Grimaldi Dan Grimaldi Scott Ricardo Tim Thermos Tim Thermos I am trying to group the tableview, so that each creator has only one cell in the tableview and is listed once and only once like this... Chris Ryder Dan Grimaldi Scott Ricardo Tim Thermos any help or suggestions would be greatly appreciated.

    Read the article

  • Insert new relationship data in core data

    - by michael
    Hoping someone can shed some light on what I might be doing wrong here. Trying to add an "event" to a list of events, represented by a one to many (inverse) relationship: MyEvents <--- Event This is the code: MyEvents *myEvents = (MyEvents *)[NSEntityDescription insertNewObjectForEntityForName:@"MyEvents" inManagedObjectContext:context]; NSLog(@"MYEVENTS: %@", myEvents); NSLog(@"EVENT: %@", event); [myEvents addEventObject:event]; My context is fine, and both the myEvents and event print perfectly valid information. When I try to add the event that I have (which is passed into this view controller, having been retrieved from core data previously) with this code [myEvents addEventObject:event]; It falls over with *** -[NSComparisonPredicate evaluateWithObject:]: message sent to deallocated instance MyEvents and Event are just the default generated code. MyEvents contains only the relationship to event. Thanks.

    Read the article

< Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >