application data - Page 14

convert unstructured data to structured data?

- by Codeguru

How to convert unstructured data into structured data?For example email,contacts to a structured format. Are there any algorithms to do this??

Keeping iPhone application in sync with GWT application.

Hi, I'm working on an iPhone application that should work in offline and online modes. In it's online mode it's supposed to feed all the information the user enters to a webservice backed by GWT/GAE. In it's offline mode it's supposed to store the information locally, and when connection is available sync it up to the web service. Currently my plan is as follows: Provide a connection between an app and a webservice using Protobuffers for efficient over-the-wire communication Work with local DB using Core Data Poll the network status, and when available sync the database and keep some sort of local-db-to-remote-db key synchronization. The question is - am I in the right direction? Are the standard patterns for implementing this? Maybe someone can point me to an open-source application that works in a similar fashion? I am really new to iPhone coding, and would be very glad to hear any suggestions. Thanks

Read the article

When is the Data Vault model the right model for a data-warehouse?

- by Stephan Eggermont

I recently found a reference to 'Data Vault modeling' as a model for data-warehouses. The models I've seen before are Inmon and Kimball. The author refers to possible performance problems due to the joins needed. It looks like a nice model, but I wonder about the gotcha's. Are there any experience reports on-line?

Read the article

[Visual C++]Forcing memory alignment of variables/data-structures

- by John

I'm looking at using SSE and I gather aligning data on 16byte boundaries is recommended. There are two cases to consider: float data[4]; struct myystruct { float x,y,z,w; }; I'm not sure the first case can be done explicitly, though there's perhaps a compiler option I could use? In the second case I remember being able to control packing in old versions of GCC several years back, is this still possible?

Read the article

SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

- by pinaldave

This will drop all the clean buffers so we will be able to start again from there. Now, run the following script and check the execution plan of the query. Have you ever wondered what types of data are there in your cache? During SQL Server Trainings, I am usually asked if there is any way one can know how much data in a table is stored in the memory cache? The more detailed question I usually get is if there are multiple indexes on table (and used in a query), were the data of the single table stored multiple times in the memory cache or only for a single time? Here is a query you can run to figure out what kind of data is stored in the cache. USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Now let us run the query above and observe the output of the same. We can see in the above query that there are four columns. Cached_Pages_Count lists the pages cached in the memory. BaseTableName lists the original base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index. Now, let us do one more experience here. Please note that you should not run this test on a production server as it can extremely reduce the performance of the database. DBCC DROPCLEANBUFFERS This will drop all the clean buffers and we will be able to start again from there. Now run following script and check the execution plan for the same. USE AdventureWorks GO SELECT UnitPrice, ModifiedDate FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 1 AND 100 GO The execution plans contain the usage of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. Let me know what you think of this article. I had a great pleasure while writing this article because I was able to write on this subject, which I like the most. In the next article, we will exactly see what data are cached and those that are not cached, using a few undocumented commands. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

Read the article

Bitmask data insertions in SSDT Post-Deployment scripts

- by jamiet

On my current project we are using SQL Server Data Tools (SSDT) to manage our database schema and one of the tasks we need to do often is insert data into that schema once deployed; the typical method employed to do this is to leverage Post-Deployment scripts and that is exactly what we are doing. Our requirement is a little different though, our data is split up into various buckets that we need to selectively deploy on a case-by-case basis. I was going to use a SQLCMD variable for each bucket (defaulted to some value other than “Yes”) to define whether it should be deployed or not so we could use something like this in our Post-Deployment script: IF ($(DeployBucket1Flag) = 'Yes')BEGIN :r .\Bucket1.data.sqlENDIF ($(DeployBucket2Flag) = 'Yes')BEGIN :r .\Bucket2.data.sqlENDIF ($(DeployBucket3Flag) = 'Yes')BEGIN :r .\Bucket3.data.sqlEND That works fine and is, I’m sure, a very common technique for doing this. It is however slightly ugly because we have to litter our deployment with various SQLCMD variables. My colleague James Rowland-Jones (whom I’m sure many of you know) suggested another technique – bitmasks. I won’t go into detail about how this works (James has already done that at Using a Bitmask - a practical example) but I’ll summarise by saying that you can deploy different combinations of the buckets simply by supplying a different numerical value for a single SQLCMD variable. Each bit of that value’s binary representation signifies whether a particular bucket should be deployed or not. This is better demonstrated using the following simple script (which can be easily leveraged inside your Post-Deployment scripts): /* $(DeployData) is a SQLCMD variable that would, if you were using this in SSDT, be declared in the SQLCMD variables section of your project file. It should contain a numerical value, defaulted to 0. In this example I have declared it using a :setvar statement. Test the affect of different values by changing the :setvar statement accordingly. Examples: :setvar DeployData 1 will deploy bucket 1 :setvar DeployData 2 will deploy bucket 2 :setvar DeployData 3 will deploy buckets 1 & 2 :setvar DeployData 6 will deploy buckets 2 & 3 :setvar DeployData 31 will deploy buckets 1, 2, 3, 4 & 5 */ :setvar DeployData 0 DECLARE @bitmask VARBINARY(MAX) = CONVERT(VARBINARY,$(DeployData)); IF (@bitmask & 1 = 1) BEGIN PRINT 'Bucket 1 insertions'; END IF (@bitmask & 2 = 2) BEGIN PRINT 'Bucket 2 insertions'; END IF (@bitmask & 4 = 4) BEGIN PRINT 'Bucket 3 insertions'; END IF (@bitmask & 8 = 8) BEGIN PRINT 'Bucket 4 insertions'; END IF (@bitmask & 16 = 16) BEGIN PRINT 'Bucket 5 insertions'; END An example of running this using DeployData=6 The binary representation of 6 is 110. The second and third significant bits of that binary number are set to 1 and hence buckets 2 and 3 are “activated”. Hope that makes sense and is useful to some of you! @Jamiet P.S. I used the awesome HTML Copy feature of Visual Studio’s Productivity Power Tools in order to format the T-SQL code above for this blog post.

Read the article

Web-Applikationen entwickeln mit Oracle Application Express

- by britta.wolf

Mit Oracle Application Express können schnell und einfach datenbankgestützte Web-Anwendungen erstellt werden. Das Tool wird kostenlos (!) über das Oracle Technology Network (OTN) bereitgestellt und kann für alle Zwecke eingesetzt werden. Wer sich über das relativ umfangreiche APEX-Material auf OTN hinaus informieren möchte, dem empfehle ich wärmstens die deutschsprachige APEX-Community-Seite. Die Webseite wird super gepflegt und bietet Einsteigerinformationen, regelmäßige Neuigkeiten, nützliche Tipps und weiterführende Links, sowie Hinweise zu Community-Treffen oder Web-Sessions. Es besteht derzeit die Möglichkeit, auf die neueste Version, die sogenannte Early Adopter Version (EA Phase2) von APEX 4.0 zuzugreifen: http://tryapexnow.com/ Zuerst müssen Sie sich registrieren. Danach fordern Sie Ihren persönlichen Workspace (Arbeitsbereich) an. Innerhalb weniger Minuten bekommen Sie per Mail die Bestätigung und eine Aufforderung, die Account-Erstellung zu finalisieren. Und dann kann es los gehen...!

Read the article

Add a Customizable, Free Application Launcher to your Windows Desktop

- by Lori Kaufman

RocketDock is an application launcher for Windows modeled after the Mac OS X launch toolbar. It’s a dock that sits along an edge of your screen and contains a collection of shortcuts that expand when you hover over them and launch programs when clicked. You can easily add shortcuts to programs, files, documents, folders, and even actions to the dock. The look of the dock is customizable using themes and icons. Docklets are available to help extend the functionality of your dock. We’ll show you how to install RocketDock, change the dock settings, add shortcuts to the dock, change the settings for shortcut icons, and add new themes to your dock. We’ll also show you how to install and setup a docklet, using the Stacks docklet as an example. HTG Explains: What Is RSS and How Can I Benefit From Using It? HTG Explains: Why You Only Have to Wipe a Disk Once to Erase It HTG Explains: Learn How Websites Are Tracking You Online

Read the article

Looking for Cutting-Edge Data Integration: 2014 Excellence Awards

- by Sandrine Riley

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 It is nomination time!!! This year's Oracle Fusion Middleware Excellence Awards will honor customers and partners who are creatively using various products across Oracle Fusion Middleware. Think you have something unique and innovative with one or a few of our Oracle Data Integration products? We would love to hear from you! Please submit today. The deadline for the nomination is June 20, 2014. What you win: An Oracle Fusion Middleware Innovation trophy One free pass to Oracle OpenWorld 2014 Priority consideration for placement in Profit magazine, Oracle Magazine, or other Oracle publications & press release Oracle Fusion Middleware Innovation logo for inclusion on your own Website and/or press release Let us reminisce a little… For details on the 2013 Data Integration Winners: Royal Bank of Scotland’s Market and International Banking and The Yalumba Wine Company, check out this blog post: 2013 Oracle Excellence Awards for Fusion Middleware Innovation… and the Winners for Data Integration are… and for details on the 2012 Data Integration Winners: Raymond James and Morrisons, check out this blog post: And the Winners of Fusion Middleware Innovation Awards in Data Integration are… Now to view the 2013 Winners (for all categories). We hope to honor you! Here's what you need to do: Click here to submit your nomination today. And just a reminder: the deadline to submit a nomination is 5pm Pacific Time on June 20, 2014. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article

How to fill certain application design learning "gaps"?

- by e4rthdog

In life it doesnt matter if you do one thing for 15 years. You will end up waking one day and asking stuff that are equal to "how do i walk?" :) My specific question is that as a new entrant to C# and OOP i am stepping into many little "details" that need to be addressed. Written a lot of code in VB.NET / cobol / simple php e.t.c surely does not help much into the OOP world... So , even after reading entry level books for C# and watching some videos i recently found out about the "factory model design" for applications. I would appreciate if any of you guys recomment some reading on application design architecture for further reading...

Read the article

Recommended language and IDE for simple linux application [on hold]

- by niklon

I want to write a simple program on Debian with Gnome. Application will act as a side bar, giving simple information on online servers statuses. I preferably have a black transparent background(Terminal-like). I'm asking this question because I was previously writing programs in .NET C# for myself, and now I don't want to get to Mono, but something more conventional. What language should I choose for this task? What would be the recommended way to do it?(eg. what IDE)

Read the article

Error accessing Gio.Gsettings on application made in quickly

- by Zane Swafford

I am trying to develop an application using the quickly/pygtk stack. I got my Gsettings schemas all set up in ~/app-name-here/data/glib-2.0/schemas/net.launchpad.app-name-here.gschema.xml correctly and I am able to access it just fine in my preferences dialog window that is located in ~/app-name-here/app-name-here/PreferencesDialog.py via from gi.repository import Gtk, Gio settings = Gio.Settings("net.launchpad.app-name-here") settings.get_boolean('notify') settings.set_boolean('notify', True) but when I try to check the value of one of my settings in a file located in ~/app-name-here/bin/Daemon.py that I use as a script to run in the background and send notifications by a similar method of from gi.repository import Gio settings = Gio.Settings("net.launchpad.app-name-here") settings.get_boolean('notify') it fails at the line that says settings = Gio.Settings("net.launchpad.app-name-here") and spits out a nasty error (Daemon.py:26100): GLib-GIO-ERROR **: Settings schema 'net.launchpad.app-name-here' is not installed Despite the fact that I can open up dconf-editor and find the settings under net/launchpad/app-name-here. Any thoughts?

Read the article

Help building pygtk application

- by umpirsky

This is my application. It is created with quickly. I would like to package it for Ubuntu now. I tried to package it with uickly, but that failed. At first, I was trying to install it using setup.py. But it is only copied in python lib dir, no icon, no desktop file installed. Then I tried to follow this guide, but I ended up with package without icon and it was of bad quality, but most important of all, it does not use setup.py, and it was pretty complicated. I would like to automate packaging process Can someone point me in the right direction, some examples of existing apps that have automated packaging etc? Thanks in advance.

Read the article

Quickly Application making it run on startup

- by unknownone

I have a PyGtk application that I made using Quickly, and I would like to have it run on startup when installed. How would I go about doing this? I'm not sure if sticking the .desktop file in ~/.config/autostart/ would make it work or not. If that will fix it, I don't know how to add it to that folder since Quickly packages the project for you and has it's own installation script. Is it possible to modify what it does on installation? If possible, I would also like to add the program in the System Settings Personal tab, but I do not know how to do that. Thanks

Read the article

Make application automatically detect system language

- by hakermania

What should an application developed under a Linux System like Ubuntu do so as to automatically detect the system language? There are applications, like Liferea that automatically change their language to match the system's, without altering any preference of the program itself: Should this be the "default" behavior for all the programs? Should there be an option on the program so as to let the user choose the language nonetheless? Are all these translations coming along with the program itself? What if the user has set a system language not available in the translations of the program? Is this Ubuntu or most-linux-distros specific?

Read the article

Writing a desktop application for progammer from PHP background

- by Mark

I have a client who wants a tool for him to be able to upload his products, enter orders, and keep track of customer details. There are quite a few highly customised requests, which is why he wants the tool custum made. He does not care much about the interface design - it just has to be usable and provide access to the databade. I've already designed the database. I have no experience of desktop applications and usually write my web apps in PHP with the Yii framework. But hosting this on a server seems like overkill. I also have .net experience from a few years ago. What would be the best options for writing this as a desktop application?

Read the article

Online job application

- by Fred

I am trying to add an application to my site where I can post job openings with my company and allow people to apply online. Can someone recommend a service or app already in existence for this purpose? I tried googling it, but could not find a set of search terms that did not return endless sites for job seekers. This is a (very) small business and I do not expect to have more than a few openings at any time, but what I am actually interested in is having a repository of interested job seekers to have on file. Then when people ask me about openings, I could just refer them to the page and they could apply. Then, if we have an opening, we could look through the list of candidates and if we can't fill the position(s) from that list, we could post the job and advertise to fill the position.

Read the article

Help with Perl persistent data storage using Data::Dumper

- by stephenmm

I have been trying to figure this out for way to long tonight. I have googled it to death and none of the examples or my hacks of the examples are getting it done. It seems like this should be pretty easy but I just cannot get it. Here is the code: #!/usr/bin/perl -w use strict; use Data::Dumper; my $complex_variable = {}; my $MEMORY = "$ENV{HOME}/data/memory-file"; $complex_variable->{ 'key' } = 'value'; $complex_variable->{ 'key1' } = 'value1'; $complex_variable->{ 'key2' } = 'value2'; $complex_variable->{ 'key3' } = 'value3'; print Dumper($complex_variable)."TEST001\n"; open M, ">$MEMORY" or die; print M Data::Dumper->Dump([$complex_variable], ['$complex_variable']); close M; $complex_variable = {}; print Dumper($complex_variable)."TEST002\n"; # Then later to restore the value, it's simply: do $MEMORY; #eval $MEMORY; print Dumper($complex_variable)."TEST003\n"; And here is my output: $VAR1 = { 'key2' => 'value2', 'key1' => 'value1', 'key3' => 'value3', 'key' => 'value' }; TEST001 $VAR1 = {}; TEST002 $VAR1 = {}; TEST003 Everything that I read says that the TEST003 output should look identical to the TEST001 output which is exactly what I am trying to achieve. What am I missing here? Should I be "do"ing differently or should I be "eval"ing instead and if so how? Thanks for any help...

Read the article

Relational database data explorer / visualization?

- by Ian Boyd

Is there a tool that can let one browse relational data as a graph of connected nodes? For example, i'm faced with trying to cleanse some anomolous data. i can start with two offending rows. In this particular example, the TransactionID should, by business rules, be unique to the table, but i find a transaction that violates that rule: SELECT * FROM LCTTrans WHERE TransactionID = 1075048 LCTID TransactionID ========= ============= 4358 1075048 4359 1075048 2 row(s) affected But really what i want to begin to hunt down all the related data, to try to see which is right. So this hypothetical software would start by showing me these two rows: Next, i want to see that transaction that is linked into this table: Now that transaction points to an MAL, so show me that: Now lets add those two LCTs, that the transaction is "on". A transaction can be on only one LCT, yet this one is pointing to two: Okay computer, both of those LCTs point to an MAL and the transaction that created them, show me those: Those last two transactions, they also point at an MAL, and they themselves point to an LCT, show me those: Okay, now are there any entries in LCTTrans that point to LCTs 4358 or 4359?... And so on, and so on. Now i did all this manually, running single selects, copying and pasting uniqueidentifier keys and converting them into friendly id numbers so i could easily see the relationships. Is there software that can do this?

Read the article

How to properly set relationships in Core Data when using setValue and data already exists

- by ern

Let's say I have two objects: Articles and Categories. For the sake of this example all relevant categories have already been added to the data store. When looping through data that holds edits for articles, there is category relationship information that needs to be saved. I was planning on using the -setValue method in the Article class in order to set the relationships like so: - (void)setValue:(id)value forUndefinedKey:(NSString *)key { if([key isEqualToString:@"categories"]){ NSLog(@"trying to set categories..."); } } The problem is that value isn't a Category, it is just a string (or array of strings) holding the title of a category. I could certainly do a lookup within this method for each category and assign it, but that seems inefficient when processing a whole bunch of articles at once. Another option is to populate an array of all possible categories and just filter, but my question is where to store that array? Should it be a class method on Article? Is there a way to pass in additional data to the -setValue method? Is there another, better option for setting the relationship I'm not thinking of? Thanks for your help.

Read the article

Suggested Web Application Framework and Database for Enterprise, “Big-Data” App?

- by willOEM

I have a web application that I have been developing for a small group within my company over the past few years, using Pipeline Pilot (plus jQuery and Python scripting) for web development and back-end computation, and Oracle 10g for my RDBMS. Users upload experimental genomic data, which is parsed into a database, and made available for querying, transformation, and reporting. Experimental data sets are large and have many layers of metadata. A given experimental data record might have a foreign key relationship with a table that describes this data point's assay. Assays can cover multiple genes, which can have multiple transcript, which can have multiple mutations, which can affect multiple signaling pathways, etc. Users need to approach this data from any point in those layers in the metadata. Since all data sets for a given data type can run over a billion rows, this results in some large, dynamic queries that are hard to predict. New data sets are added on a weekly basis (~1GB per set). Experimental data is never updated, but the associated metadata can be updated weekly for a few records and yearly for most others. For every data set insert the system sees, there will be between 10 and 100 selects run against it and associated data. It is okay for updates and inserts to run slow, so long as queries run quick and are as up-to-date as possible. The application continues to grow in size and scope and is already starting to run slower than I like. I am worried that we have about outgrown Pipeline Pilot, and perhaps Oracle (as the sole database). Would a NoSQL database or an OLAP system be appropriate here? What web application frameworks work well with systems like this? I'd like the solution to be something scalable, portable and supportable X-years down the road. Here is the current state of the application: Web Server/Data Processing: Pipeline Pilot on Windows Server + IIS Database: Oracle 10g, ~1TB of data, ~180 tables with several billion-plus row tables Network Storage: Isilon, ~50TB of low-priority raw data

Read the article

when to introduce an application services tier in an n-tier application

- by user20358

I am developing a web based application whose primary objective is to fetch data from the database, display it on the UI, take in user inputs and write them back to the database. The application is not going to be doing any industrial strength algorithm crunching, but will be receiving a very high number of hits at peak times (described below) which will be changing thru the day. The layers are your typical Presentation, Business, Data. The data layer is taken care of by the database server. The business layer will contain the DAL component to access the database server over tcp. The choices I have to separate these layers into tiers are: The presentation and business layers can be either kept on the same tier. The presentation layer on a separate tier by itself and the business layer on a separate tier by itself. In the case of choice 2, the business layer will be accessed by the presentation layer using a WCF service either over http or tcp. I don't see any heavy processing being done on the Business layer, so I am leaning towards option 1 above. I also feel for the same reason, adding a new tier will only introduce the network latency. However, in terms of scalability in case I need to scale up or scale out, which is a better way to go? This application will need to be able to support up to 6 million users an hour. There will be a reasonable amount of data in each user session, storing user's preferences and other details. I will be using page level caching as well.

Read the article

Simple ADF page using BAM Data Control

- by [email protected]

var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-15829414-1"); pageTracker._trackPageview(); } catch(err) {} Purpose : In this blog I will walk you through very simple steps to create an ADF page using BAM data control connection.Details : Create the projectOpen JDeveloper (make sure you have installed the SOA extension for JDev)Create new Application using "Generic Application" template.Click on "Next"Shuttle "ADF Faces" to right pane for the project technology.Click "Finish"Create a BAM connectionIn the resource palette click on "Folder->New Connection -> BAM"Enter the connection name and click "Next"Enter Connection details Click on "Test connection" and "Finish"Create the BAM Data Control Open the IDE connection created in above step.Drag and drop "Employees" to "Data controls" palette.Select "Flat Query" and Click "Finish".Create the View Create a new JSF page.From Data control Panel drag and drop "Employees->Query->ADF Read Only table"Right click and Run the page.

Read the article

Data Aggregation of CSV files java

- by royB

I have k csv files (5 csv files for example), each file has m fields which produce a key and n values. I need to produce a single csv file with aggregated data. I'm looking for the most efficient solution for this problem, speed mainly. I don't think by the way that we will have memory issues. Also I would like to know if hashing is really a good solution because we will have to use 64 bit hashing solution to reduce the chance for a collision to less than 1% (we are having around 30000000 rows per aggregation). For example file 1: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,50,60,70,80 a3,b2,c4,60,60,80,90 file 2: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,30,50,90,40 a3,b2,c4,30,70,50,90 result: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,80,110,160,120 a3,b2,c4,90,130,130,180 algorithm that we thought until now: hashing (using concurentHashTable) merge sorting the files DB: using mysql or hadoop or redis. The solution needs to be able to handle Huge amount of data (each file more than two million rows) a better example: file 1 country,city,peopleNum england,london,1000000 england,coventry,500000 file 2: country,city,peopleNum england,london,500000 england,coventry,500000 england,manchester,500000 merged file: country,city,peopleNum england,london,1500000 england,coventry,1000000 england,manchester,500000 The key is: country,city. This is just an example, my real key is of size 6 and the data columns are of size 8 - total of 14 columns. We would like that the solution will be the fastest in regard of data processing.

Search Results

Search found 97400 results on 3896 pages for 'application data'.

Page 14/3896 | < Previous Page | 10 11 12 13 14 15 16 17 18 19 20 21 | Next Page >

- by Codeguru

- by Reflog

- by Stephan Eggermont

- by John

- by pinaldave

- by jamiet

- by britta.wolf

- by Lori Kaufman

- by Sandrine Riley

- by e4rthdog

- by e4rthdog

- by niklon

- by Zane Swafford

- by umpirsky

- by unknownone

- by hakermania

- by Mark

- by Fred

- by stephenmm

- by Ian Boyd

- by ern

- by willOEM

- by user20358

- by [email protected]

- by royB

< Previous Page | 10 11 12 13 14 15 16 17 18 19 20 21 | Next Page >