Search Results

Search found 58475 results on 2339 pages for 'data mining'.

Page 22/2339 | < Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >

  • How should I architect my Model and Data Access layer objects in my website?

    - by Robin Winslow
    I've been tasked with designing Data layer for a website at work, and I am very interested in architecture of code for the best flexibility, maintainability and readability. I am generally acutely aware of the value in completely separating out my actual Models from the Data Access layer, so that the Models are completely naive when it comes to Data Access. And in this case it's particularly useful to do this as the Models may be built from the Database or may be built from a Soap web service. So it seems to me to make sense to have Factories in my data access layer which create Model objects. So here's what I have so far (in my made-up pseudocode): class DataAccess.ProductsFromXml extends DataAccess.ProductFactory {} class DataAccess.ProductsFromDatabase extends DataAccess.ProductFactory {} These then get used in the controller in a fashion similar to the following: var xmlProductCreator = DataAccess.ProductsFromXml(xmlDataProvider); var databaseProductCreator = DataAccess.ProductsFromXml(xmlDataProvider); // Returns array of Product model objects var XmlProducts = databaseProductCreator.Products(); // Returns array of Product model objects var DbProducts = xmlProductCreator.Products(); So my question is, is this a good structure for my Data Access layer? Is it a good idea to use a Factory for building my Model objects from the data? Do you think I've misunderstood something? And are there any general patterns I should read up on for how to write my data access objects to create my Model objects?

    Read the article

  • how to find maximum frequent item sets from large transactional data file

    - by ANIL MANE
    Hi, I have the input file contains large amount of transactions like Transaction ID Items T1 Bread, milk, coffee, juice T2 Juice, milk, coffee T3 Bread, juice T4 Coffee, milk T5 Bread, Milk T6 Coffee, Bread T7 Coffee, Bread, Juice T8 Bread, Milk, Juice T9 Milk, Bread, Coffee, T10 Bread T11 Milk T12 Milk, Coffee, Bread, Juice i want the occurrence of every unique item like Item Name Count Bread 9 Milk 8 Coffee 7 Juice 6 and from that i want an a fp-tree now by traversing this tree i want the maximal frequent itemsets as follows The basic idea of method is to dispose nodes in each “layer” from bottom to up. The concept of “layer” is different to the common concept of layer in a tree. Nodes in a “layer” mean the nodes correspond to the same item and be in a linked list from the “Head Table”. For nodes in a “layer” NBN method will be used to dispose the nodes from left to right along the linked list. To use NBN method, two extra fields will be added to each node in the ordered FP-Tree. The field tag of node N stores the information of whether N is maximal frequent itemset, and the field count’ stores the support count information in the nodes at left. In Figure, the first node to be disposed is “juice: 2”. If the min_sup is equal to or less than 2 then “bread, milk, coffee, juice” is a maximal frequent itemset. Firstly output juice:2 and set the field tag of “coffee:3” as “false” (the field tag of each node is “true” initially ). Next check whether the right four itemsets juice:1 be the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 set the field tag of the node “false”. In the following process when the field tag of the disposed node is FALSE we can omit the node after the same tagging. If the min_sup is more than 2 then check whether the right four juice:1 is the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 then set the field count’ of the node with the sum of the former count’ and 2 After all the nodes “juice” disposed ,begin to dispose the node “coffee:3”. Any suggestions or available source code, welcome. thanks in advance

    Read the article

  • Calculating percentiles in Excel with "buckets" data instead of the data list itself

    - by G B
    I have a bunch of data in Excel that I need to get certain percentile information from. The problem is that instead of having the data set made up of each value, I instead have info on the number of or "bucket" data. For example, imagine that my actual data set looks like this: 1,1,2,2,2,2,3,3,4,4,4 The data set that I have is this: Value No. of occurrences 1 2 2 4 3 2 4 3 Is there an easy way for me to calculate percentile information (as well as the median) without having to explode the summary data out to full data set? (Once I did that, I know that I could just use the Percentile(A1:A5, p) function) This is important because my data set is very large. If I exploded the data out, I would have hundreds of thousands of rows and I would have to do it for a couple of hundred data sets. Help!

    Read the article

  • How do I setup a WCF Data Service with an ADO.NET Entity Entity Model in another assembly?

    - by lsb
    Hi! I have an ASP.NET 4.0 website that has an Entity Data Model hooked up to WCF Data Service. When the Service and Model are in the same assembly everything works. Unfortunately, when I move the Model to another "shared" assembly (and change the namespace) the service compiles but throws a 500 error when launched in a browser. The reason I want to have the Model in a common assembly (lets call it RiaTest.Shared) is that I want share common validation code between the client and service (by checking "Reuse types in referenced assemblies" in the Advanced tab of the Add Service Reference dialog). Anyway, I've spent a couple of hours on this to no avail so any help in the regard would be appreciated...

    Read the article

  • Knowledge mining using Hadoop.

    - by Anurag
    Hello there, I want to do a project Hadoop and map reduce and present it as my graduation project. To this, I've given some thought,searched over the internet and came up with the idea of implementing some basic knowledge mining algorithms say on a social websites like Facebook or may stckoverflow, Quora etc and draw some statistical graphs, comparisons frequency distributions and other sort of important values.For searching purpose would it be wise to use Apache Solr ? I want know If such thing is feasible using the above mentioned tools, if so how should I build up on this little idea? Where can I learn about knowledge mining algorithms which are easy to implement using java and map reduce techniques? In case this is a wrong idea please suggest what else can otherwise be done on using Hadoop and other related sub-projects? Thank you

    Read the article

  • Apriori Algorithm- what to do with small min.support?

    - by user3707650
    I have a question about the table beneath my question: If i was told that the given min.support=10%, how can i know what is the support count, by which i will use during the exercise? What i know is: that you take the number of transactions (8) and multiple it by the min.support: 8*(10/100)=0.8 the problem is that i get this number: 0.8, how can i use this support count during this example?? 0.8 is a number that will make me prune all combination set that i will build... please help me!!! TID A B C D E F G 10 1 0 1 0 0 0 1 20 1 1 1 1 0 1 1 30 0 0 0 0 0 0 1 40 0 0 1 0 0 1 1 50 0 0 0 1 1 0 0 60 0 1 1 0 1 1 0 70 0 0 0 0 1 1 0 80 0 0 1 0 1 1 1

    Read the article

  • Open Data, Government and Transparency

    - by Tori Wieldt
    A new track at TDC (The Developer's Conference in Sao Paulo, Brazil) is titled Open Data. It deals with open data, government and transparency. Saturday will be a "transparency hacker day" where developers are invited to create applications using open data from the Brazilian government.  Alexandre Gomes, co-lead of the track, says "I want to inspire developers to become "Civic hackers:" developers who create apps to make society better." It is a chance for developers to do well and do good. There are many opportunities for developers, including monitoring government expenditures and getting citizens involved via social networks. The open data movement is growing worldwide. One initiative, the Open Government Partnership, is working to make government data easier to find and access. Making this data easily available means that with the right applications, it will be easier for people to make decisions and suggestions about government policies based on detailed information. Last April, the Open Government Partnership held its annual meeting in Brasilia, the capitol of Brazil. It was a great success showcasing the innovative work being done in open data by governments, civil societies and individuals around the world. For example, Bulgaria now publishes daily data on budget spending for all public institutions. Alexandre Gomes Explains Open Data At TDC, the Open Data track will include a presentation of examples of successful open data projects, an introduction to the semantic web, how to handle big data sets, techniques of data visualization, and how to design APIs.The other track lead is Christian Moryah Miranda, a systems analyst for the Brazilian Government's Ministry of Planning. "The Brazilian government wholeheartedly supports this effort. In order to make our data available to the public, it forces us to be more consistent with our data across ministries, and that's a good step forward for us," he said. He explained the government knows they cannot achieve everything they would like without help from the public. "It is not the government versus the people, rather citizens are partners with the government, and together we can achieve great things!" Miranda exclaimed. Saturday at TDC will be a "transparency hacker day" where developers will be invited to create applications using open data from the Brazilian government. Attendees are invited to pitch their ideas, work in small groups, and present their project at the end of the conference. "For example," Gomes said, "the Brazilian government just released the salaries of all government employees and I can't wait to see what developers can do with that." Resources Open Government Partnership  U.S. Government Open Data ProjectBrazilian Government Open Data ProjectU.K. Government Open Data Project 2012 International Open Government Data Conference 

    Read the article

  • Master Data Management and Cloud Computing

    - by david.butler(at)oracle.com
    Cloud Computing is all the rage these days. There are many reasons why this is so. But like its predecessor, Service Oriented Architecture, it can fall on hard times if the underlying data is left unmanaged. Master Data Management is the perfect Cloud companion. It can materially increase the chances for successful Cloud initiatives. In this blog, I'll review the nature of the Cloud and show how MDM fits in.   Here's the National Institute of Standards and Technology Cloud definition: •          Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.   Cloud architectures have three main layers: applications or Software as a Service (SaaS), Platforms as a Service (PaaS), and Infrastructure as a Service (IaaS). SaaS generally refers to applications that are delivered to end-users over the Internet. Oracle CRM On Demand is an example of a SaaS application. Today there are hundreds of SaaS providers covering a wide variety of applications including Salesforce.com, Workday, and Netsuite. Oracle MDM applications are located in this layer of Oracle's On Demand enterprise Cloud platform. We call it Master Data as a Service (MDaaS). PaaS generally refers to an application deployment platform delivered as a service. They are often built on a grid computing architecture and include database and middleware. Oracle Fusion Middleware is in this category and includes the SOA and Data Integration products used to connect SaaS applications including MDM. Finally, IaaS generally refers to computing hardware (servers, storage and network) delivered as a service.  This typically includes the associated software as well: operating systems, virtualization, clustering, etc.    Cloud Computing benefits are compelling for a large number of organizations. These include significant cost savings, increased flexibility, and fast deployments. Cost advantages include paying for just what you use. This is especially critical for organizations with variable or seasonal usage. Companies don't have to invest to support peak computing periods. Costs are also more predictable and controllable. Increased agility includes access to the latest technology and experts without making significant up front investments.   While Cloud Computing is certainly very alluring with a clear value proposition, it is not without its challenges. An IDC survey of 244 IT executives/CIOs and their line-of-business (LOB) colleagues identified a number of issues:   Security - 74% identified security as an issue involving data privacy and resource access control. Integration - 61% found that it is hard to integrate Cloud Apps with in-house applications. Operational Costs - 50% are worried that On Demand will actually cost more given the impact of poor data quality on the rest of the enterprise. Compliance - 49% felt that compliance with required regulatory, legal and general industry requirements (such as PCI, HIPAA and Sarbanes-Oxley) would be a major issue. When control is lost, the ability of a provider to directly manage how and where data is deployed, used and destroyed is negatively impacted.  There are others, but I singled out these four top issues because Master Data Management, properly incorporated into a Cloud Computing infrastructure, can significantly ameliorate all of these problems. Cloud Computing can literally rain raw data across the enterprise.   According to fellow blogger, Mike Ferguson, "the fracturing of data caused by the adoption of cloud computing raises the importance of MDM in keeping disparate data synchronized."   David Linthicum, CTO Blue Mountain Labs blogs that "the lack of MDM will become more of an issue as cloud computing rises. We're moving from complex federated on-premise systems, to complex federated on-premise and cloud-delivered systems."    Left unmanaged, non-standard, inconsistent, ungoverned data with questionable quality can pollute analytical systems, increase operational costs, and reduce the ROI in Cloud and On-Premise applications. As cloud computing becomes more relevant, and more data, applications, services, and processes are moved out to cloud computing platforms, the need for MDM becomes ever more important. Oracle's MDM suite is designed to deal with all four of the above Cloud issues listed in the IDC survey.   Security - MDM manages all master data attribute privacy and resource access control issues. Integration - MDM pre-integrates Cloud Apps with each other and with On Premise applications at the data level. Operational Costs - MDM significantly reduces operational costs by increasing data quality, thereby improving enterprise business processes efficiency. Compliance - MDM, with its built in Data Governance capabilities, insures that the data is governed according to organizational standards. This facilitates rapid and accurate reporting for compliance purposes. Oracle MDM creates governed high quality master data. A unified cleansed and standardized data view is produced. The Oracle Customer Hub creates a single view of the customer. The Oracle Product Hub creates high quality product data designed to support all go-to-market processes. Oracle Supplier Hub dramatically reduces the chances of 'supplier exceptions'. Oracle Site Hub masters locations. And Oracle Hyperion Data Relationship Management masters financial reference data and manages enterprise hierarchies across operational areas from ERP to EPM and CRM to SCM. Oracle Fusion Middleware connects Cloud and On Premise applications to MDM Hubs and brings high quality master data to your enterprise business processes.   An independent analyst once said "Poor data quality is like dirt on the windshield. You may be able to drive for a long time with slowly degrading vision, but at some point, you either have to stop and clear the windshield or risk everything."  Cloud Computing has the potential to significantly degrade data quality across the enterprise over time. Deploying a Master Data Management solution prior to or in conjunction with a move to the Cloud can insure that the data flowing into the enterprise from the Cloud is clean and governed. This will in turn insure that expected returns on the investment in Cloud Computing will be realized.       Oracle MDM has proven its metal in this area and has the customers to back that up. In fact, I will be hosting a webcast on Tuesday, April 10th at 10 am PT with one of our top Cloud customers, the Church Pension Group. They have moved all mainline applications to a hosted model and use Oracle MDM to insure the master data is managed and cleansed before it is propagated to other cloud and internal systems. I invite you join Martin Hossfeld, VP, IT Operations, and Danette Patterson, Enterprise Data Manager as they review business drivers for MDM and hosted applications, how they did it, the benefits achieved, and lessons learned. You can register for this free webcast here.  Hope to see you there.

    Read the article

  • Can anyone explain to me what problem Core Data solves?

    - by Curtis Sumpter
    Core Data seems to add a needless layer of complexity. If you want to save data created natively by the user in an app why not just use an object and then write the data all to SQLite or back to a server using a RESTful script if necessary. Android doesn't have Core Data (though if it has something similar I haven't seen it.). What the heck is the point of buggy CD except useless needless overhead for people who can't write SQL or CGI scripts?

    Read the article

  • Difference between "Data Binding'","Data Hiding","Data Wraping" and "Encapsulation"?

    - by krishna Chandra
    I have been studying the conpects of Object oriented programming. Still I am not able to distinguish between the following concepts of object oriented programming.. a) Data Binding b) Data Hiding c) Data Wrapping d) encapsulation e) Data Abstraction I have gone through a lot of books ,and I also search the difference in google. but still I am not able to make the difference between these? Could anyone please help me ?

    Read the article

  • Recover harddrive data

    - by gameshints
    I have a dell laptop that recently "died" (It would get the blue screen of death upon starting) and the hard drive would make a weird cyclic clicking noises. I wanted to see if I could use some tools on my linux machine to recover the data, so I plugged it into there. If I run "fdisk" I get: Disk /dev/sdb: 20.0 GB, 20003880960 bytes 64 heads, 32 sectors/track, 19077 cylinders Units = cylinders of 2048 * 512 = 1048576 bytes Disk identifier: 0x64651a0a Disk /dev/sdb doesn't contain a valid partition table Fine, the partition table is messed up. However if I run "testdisk" in attempt to fix the table, it freezes at this point, making the same cyclical clicking noises: Disk /dev/sdb - 20 GB / 18 GiB - CHS 19078 64 32 Analyse cylinder 158/19077: 00% I don't really care about the hard drive working again, and just the data, so I ran "gpart" to figure out where the partitions used to be. I got this: dev(/dev/sdb) mss(512) chs(19077/64/32)(LBA) #s(39069696) size(19077mb) * Warning: strange partition table magic 0x2A55. Primary partition(1) type: 222(0xDE)(UNKNOWN) size: 15mb #s(31429) s(63-31491) chs: (0/1/1)-(3/126/63)d (0/1/32)-(15/24/4)r hex: 00 01 01 00 DE 7E 3F 03 3F 00 00 00 C5 7A 00 00 Primary partition(2) type: 007(0x07)(OS/2 HPFS, NTFS, QNX or Advanced UNIX) (BOOT) size: 19021mb #s(38956987) s(31492-38988478) chs: (4/0/1)-(895/126/63)d (15/24/5)-(19037/21/31)r hex: 80 00 01 04 07 7E FF 7F 04 7B 00 00 BB 6F 52 02 So I tried to mount just to the old NTFS partition, but got an error: sudo mount -o loop,ro,offset=16123904 -t ntfs /dev/sdb /mnt/usb NTFS signature is missing. Ugh. Okay. But then I tried to get a raw data dump by running dd if=/dev/sdb of=/home/erik/brokenhd skip=31492 count=38956987 But the file got up to 59885568 bytes, and made the same cyclical clicking noises. Obviously there is a bad sector, but I don't know what to do about it! The data is still there... if I view that 57MB file in textpad... I can see raw data from files. How can I get my data back? Thanks for any suggestions, Solution: I was able to recover about 90% of my data: Froze harddrive in freezer Used Ddrescue to make a copy of the drive Since Ddrescue wasn't able to get enough of my drive to use testdisk to recover my partitions/file system, I ended up using photorec to recover most of my files

    Read the article

  • Webbased data modelling and management tool

    - by pixeldude
    Is there a web-based tool available, where I am able to... ...define data models (like in a database admin tool) ...fill in data (in custom web forms, not too generic) with basic features like completion ...import data from CSV oder Excel Sheets ...export data to CSV or SQL ...create snapshots of my data models (versions, diff, etc.) ...share my data models ...discuss/collaborate with other people about my data models Well, I can develop something like this in PHP or with Ruby or whatever. But this is such a common task, where the application support could be a lot better. And it would be language and database independent. This would help to maintain data models in different versions and you can maybe share your data models with others, extend it with your team members, etc. There is a website called FreeBase, which allows you to define a data entity model and fill in data, which also has export features, but I need to define my own data model with my own granularity and structure. And it should not be shared in public if I don't want to. How do you solve problems like this yourself?

    Read the article

  • Advanced Data Source Engine coming to Telerik Reporting Q1 2010

    This is the final blog post from the pre-release series. In it we are going to share with you some of the updates coming to our reporting solution in Q1 2010. A new Declarative Data Source Engine will be added to Telerik Reporting, that will allow full control over data management, and deliver significant gains in rendering performance and memory consumption. Some of the engines new features will be: Data source parameters - those parameters will be used to limit data retrieved from the data source to just the data needed for the report. Data source parameters are processed on the data source side, however only queried data is fetched to the reporting engine, rather than the full data source. This leads to lower memory consumption, because data operations are performed on queried data only, rather than on all data. As a result, only the queried data needs to be stored in the memory vs. the whole dataset, which was the case with the old approach Support for stored procedures - they will assist in achieving a consistent implementation of logic across applications, and are especially practical for performing repetitive tasks. A stored procedure stores the SQL statements and logic, which can then be executed in different reports and/or applications. Stored Procedures will not only save development time, but they will also improve performance, because each stored procedure is compiled on the data base server once, and then is reutilized. In Telerik Reporting, the stored procedure will also be parameterized, where elements of the SQL statement will be bound to parameters. These parameterized SQL queries will be handled through the data source parameters, and are evaluated at run time. Using parameterized SQL queries will improve the performance and decrease the memory footprint of your application, because they will be applied directly on the database server and only the necessary data will be downloaded on the middle tier or client machine; Calculated fields through expressions - with the help of the new reporting engine you will be able to use field values in formulas to come up with a calculated field. A calculated field is a user defined field that is computed "on the fly" and does not exist in the data source, but can perform calculations using the data of the data source object it belongs to. Calculated fields are very handy for adding frequently used formulas to your reports; Improved performance and optimized in-memory OLAP engine - the new data source will come with several improvements in how aggregates are calculated, and memory is managed. As a result, you may experience between 30% (for simpler reports) and 400% (for calculation-intensive reports) in rendering performance, and about 50% decrease in memory consumption. Full design time support through wizards - Declarative data sources are a great advance and will save developers countless hours of coding. In Q1 2010, and true to Telerik Reportings essence, using the new data source engine and its features requires little to no coding, because we have extended most of the wizards to support the new functionality. The newly extended wizards are available in VS2005/VS2008/VS2010 design-time. More features will be revealed on the product's what's new page when the new version is officially released in a few days. Also make sure you attend the free webinar on Thursday, March 11th that will be dedicated to the updates in Telerik Reporting Q1 2010. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • BizTalk Cross Reference Data Management Strategy

    - by charlie.mott
    Article Source: http://geekswithblogs.net/charliemott This article describes an approach to the management of cross reference data for BizTalk.  Some articles about the BizTalk Cross Referencing features can be found here: http://home.comcast.net/~sdwoodgate/xrefseed.zip http://geekswithblogs.net/michaelstephenson/archive/2006/12/24/101995.aspx http://geekswithblogs.net/charliemott/archive/2009/04/20/value-vs.id-cross-referencing-in-biztalk.aspx Options Current options to managing this data include: Maintaining xml files in the format that can be used by the out-of-the-box BTSXRefImport.exe utility. Use of user interfaces that have been developed to manage this data: BizTalk Cross Referencing Tool XRef XML Creation Tool However, there are the following issues with the above options: The 'BizTalk Cross Referencing Tool' requires a separate database to manage.  The 'XRef XML Creation' tool has no means of persisting the data settings. The 'BizTalk Cross Referencing tool' generates integers in the common id field. I prefer to use a string (e.g. acme.country.uk). This is more readable. (see naming conventions below). Both UI tools continue to use BTSXRefImport.exe.  This utility replaces all xref data. This can be a problem in continuous integration environments that support multiple clients or BizTalk target instances.  If you upload the data for one client it would destroy the data for another client.  Yet in TFS where builds run concurrently, this would break unit tests. Alternative Approach In response to these issues, I instead use simple SQL scripts to directly populate the BizTalkMgmtDb xref tables combined with a data namepacing strategy to isolate client data. Naming Conventions All data keys use namespace prefixing.  The pattern will be <companyName>.<data Type>.  The naming conventions will be to use lower casing for all items.  The data must follow this pattern to isolate it from other company cross-reference data.  The table below shows some sample data. (Note: this data uses the 'ID' cross-reference tables.  the same principles apply for the 'value' cross-referencing tables). Table.Field Description Sample Data xref_AppType.appType Application Types acme.erp acme.portal acme.assetmanagement xref_AppInstance.appInstance Application Instances (each will have a corresponding application type). acme.dynamics.ax acme.dynamics.crm acme.sharepoint acme.maximo xref_IDXRef.idXRef Holds the cross reference data types. acme.taxcode acme.country xref_IDXRefData.CommonID Holds each cross reference type value used by the canonical schemas. acme.vatcode.exmpt acme.vatcode.std acme.country.usa acme.country.uk xref_IDXRefData.AppID This holds the value for each application instance and each xref type. GBP USD SQL Scripts The data to be stored in the BizTalkMgmtDb xref tables will be managed by SQL scripts stored in a database project in the visual studio solution. File(s) Description Build.cmd A sqlcmd script to deploy data by running the SQL scripts below.  (This can be run as part of the MSBuild process).   acme.purgexref.sql SQL script to clear acme.* data from the xref tables.  As such, this will not impact data for any other company. acme.applicationInstances.sql   SQL script to insert application type and application instance data.   acme.vatcode.sql acme.country.sql etc ...  There will be a separate SQL script to insert each cross-reference data type and application specific values for these types.

    Read the article

  • Master Note for Generic Data Warehousing

    - by lajos.varady(at)oracle.com
    ++++++++++++++++++++++++++++++++++++++++++++++++++++ The complete and the most recent version of this article can be viewed from My Oracle Support Knowledge Section. Master Note for Generic Data Warehousing [ID 1269175.1] ++++++++++++++++++++++++++++++++++++++++++++++++++++In this Document   Purpose   Master Note for Generic Data Warehousing      Components covered      Oracle Database Data Warehousing specific documents for recent versions      Technology Network Product Homes      Master Notes available in My Oracle Support      White Papers      Technical Presentations Platforms: 1-914CU; This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review. Applies to: Oracle Server - Enterprise Edition - Version: 9.2.0.1 to 11.2.0.2 - Release: 9.2 to 11.2Information in this document applies to any platform. Purpose Provide navigation path Master Note for Generic Data Warehousing Components covered Read Only Materialized ViewsQuery RewriteDatabase Object PartitioningParallel Execution and Parallel QueryDatabase CompressionTransportable TablespacesOracle Online Analytical Processing (OLAP)Oracle Data MiningOracle Database Data Warehousing specific documents for recent versions 11g Release 2 (11.2)11g Release 1 (11.1)10g Release 2 (10.2)10g Release 1 (10.1)9i Release 2 (9.2)9i Release 1 (9.0)Technology Network Product HomesOracle Partitioning Advanced CompressionOracle Data MiningOracle OLAPMaster Notes available in My Oracle SupportThese technical articles have been written by Oracle Support Engineers to provide proactive and top level information and knowledge about the components of thedatabase we handle under the "Database Datawarehousing".Note 1166564.1 Master Note: Transportable Tablespaces (TTS) -- Common Questions and IssuesNote 1087507.1 Master Note for MVIEW 'ORA-' error diagnosis. For Materialized View CREATE or REFRESHNote 1102801.1 Master Note: How to Get a 10046 trace for a Parallel QueryNote 1097154.1 Master Note Parallel Execution Wait Events Note 1107593.1 Master Note for the Oracle OLAP OptionNote 1087643.1 Master Note for Oracle Data MiningNote 1215173.1 Master Note for Query RewriteNote 1223705.1 Master Note for OLTP Compression Note 1269175.1 Master Note for Generic Data WarehousingWhite Papers Transportable Tablespaces white papers Database Upgrade Using Transportable Tablespaces:Oracle Database 11g Release 1 (February 2009) Platform Migration Using Transportable Database Oracle Database 11g and 10g Release 2 (August 2008) Database Upgrade using Transportable Tablespaces: Oracle Database 10g Release 2 (April 2007) Platform Migration using Transportable Tablespaces: Oracle Database 10g Release 2 (April 2007)Parallel Execution and Parallel Query white papers Best Practices for Workload Management of a Data Warehouse on the Sun Oracle Database Machine (June 2010) Effective resource utilization by In-Memory Parallel Execution in Oracle Real Application Clusters 11g Release 2 (Feb 2010) Parallel Execution Fundamentals in Oracle Database 11g Release 2 (November 2009) Parallel Execution with Oracle Database 10g Release 2 (June 2005)Oracle Data Mining white paper Oracle Data Mining 11g Release 2 (March 2010)Partitioning white papers Partitioning with Oracle Database 11g Release 2 (September 2009) Partitioning in Oracle Database 11g (June 2007)Materialized Views and Query Rewrite white papers Oracle Materialized Views  and Query Rewrite (May 2005) Improving Performance using Query Rewrite in Oracle Database 10g (December 2003)Database Compression white papers Advanced Compression with Oracle Database 11g Release 2 (September 2009) Table Compression in Oracle Database 10g Release 2 (May 2005)Oracle OLAP white papers On-line Analytic Processing with Oracle Database 11g Release 2 (September 2009) Using Oracle Business Intelligence Enterprise Edition with the OLAP Option to Oracle Database 11g (July 2008)Generic white papers Enabling Pervasive BI through a Practical Data Warehouse Reference Architecture (February 2010) Optimizing and Protecting Storage with Oracle Database 11g Release 2 (November 2009) Oracle Database 11g for Data Warehousing and Business Intelligence (August 2009) Best practices for a Data Warehouse on Oracle Database 11g (September 2008)Technical PresentationsA selection of ObE - Oracle by Examples documents: Generic Using Basic Database Functionality for Data Warehousing (10g) Partitioning Manipulating Partitions in Oracle Database (11g Release 1) Using High-Speed Data Loading and Rolling Window Operations with Partitioning (11g Release 1) Using Partitioned Outer Join to Fill Gaps in Sparse Data (10g) Materialized View and Query Rewrite Using Materialized Views and Query Rewrite Capabilities (10g) Using the SQLAccess Advisor to Recommend Materialized Views and Indexes (10g) Oracle OLAP Using Microsoft Excel With Oracle 11g Cubes (how to analyze data in Oracle OLAP Cubes using Excel's native capabilities) Using Oracle OLAP 11g With Oracle BI Enterprise Edition (Creating OBIEE Metadata for OLAP 11g Cubes and querying those in BI Answers) Building OLAP 11g Cubes Querying OLAP 11g Cubes Creating Interactive APEX Reports Over OLAP 11g CubesSelection of presentations from the BIWA website:Extreme Data Warehousing With Exadata  by Hermann Baer (July 2010) (slides 2.5MB, recording 54MB)Data Mining Made Easy! Introducing Oracle Data Miner 11g Release 2 New "Work flow" GUI   by Charlie Berger (May 2010) (slides 4.8MB, recording 85MB )Best Practices for Deploying a Data Warehouse on Oracle Database 11g  by Maria Colgan (December 2009)  (slides 3MB, recording 18MB, white paper 3MB )

    Read the article

  • How to give my user permission to add/edit files on local apache server? [duplicate]

    - by Logan
    Possible Duplicate: How to make Apache run as current user I'm setting up my local test server again, and I seem to have forgotten how to successfully set up the LAMP server. I have installed LAMP server via tasksel command and I have configured the /var/www directory according to a guide I've found: After the lamp server installation you will need write permissions to the /var/www directory. Follow these steps to configure permissions. Add your user to the www-data group sudo usermod -a -G www-data <your user name> now add the /var/www folder to the www-data group sudo chgrp -R www-data /var/www now give write permissions to the www-data group sudo chmod -R g+w /var/www So logan user is now part of www-data group and the file/folder permissions look like the output below: logan@computer:/var/www$ ls -lart total 172 -rw-r--r-- 1 www-data www-data 1997 Oct 23 2010 wp-links-opml.php -rw-r--r-- 1 www-data www-data 3177 Nov 1 2010 wp-config-sample.php -rw-r--r-- 1 www-data www-data 3700 Jan 8 2012 wp-trackback.php -rw-r--r-- 1 www-data www-data 271 Jan 8 2012 wp-blog-header.php -rw-r--r-- 1 www-data www-data 395 Jan 8 2012 index.php -rw-r--r-- 1 www-data www-data 3522 Apr 10 2012 wp-comments-post.php -rw-r--r-- 1 www-data www-data 19929 May 6 2012 license.txt -rw-r--r-- 1 www-data www-data 18219 Sep 11 08:27 wp-signup.php -rw-r--r-- 1 www-data www-data 2719 Sep 11 16:11 xmlrpc.php -rw-r--r-- 1 www-data www-data 2718 Sep 23 12:57 wp-cron.php -rw-r--r-- 1 www-data www-data 7723 Sep 25 01:26 wp-mail.php -rw-r--r-- 1 www-data www-data 2408 Oct 26 15:40 wp-load.php -rw-r--r-- 1 www-data www-data 4663 Nov 17 10:11 wp-activate.php -rw-r--r-- 1 www-data www-data 9899 Nov 22 04:52 wp-settings.php -rw-r--r-- 1 www-data www-data 9175 Nov 29 19:57 readme.html -rw-r--r-- 1 www-data www-data 29310 Nov 30 08:40 wp-login.php drwxr-xr-x 14 root root 4096 Dec 24 17:41 .. drwx------ 9 www-data www-data 4096 Dec 26 16:11 wp-admin drwx------ 9 www-data www-data 4096 Dec 26 16:11 wp-includes -rw-rw-rw- 1 www-data www-data 3448 Dec 26 16:14 wp-config.php drwxrwxr-x 5 www-data www-data 4096 Dec 26 16:14 . drwx------ 6 www-data www-data 4096 Dec 26 16:19 wp-content Things work perfectly at http://localhost, I can view the website fine. The thing with this is that I will be working on a plugin for wordpress and I don't want to deal with separate owners under www directory to create or modify files/folders. When I give my user the ownership of /var/www recursively as logan:www-data I can create/modify files but cannot view the http://localhost. I get a Forbidden error. I'm assuming that this is because of the Apache's configuration? Which one is healthier or easier considering this is just a local test website, configuring apache to give user logan to view website and chmod /var/www logan:logan so that I can create files etc. without any sudo commands; or is it easier to configure user groups to get www-data user to act like my logan user? (Idk how that's possible, maybe putting www-data user under logan group?) Please shed some light to this subject. All I want is to be able to create/modifiy files under my user, and yet to be able to successfully view http://localhost I appreciate the help!

    Read the article

  • Oracle Enterprise Data Quality: Ever Integration-ready

    - by Mala Narasimharajan
    It is closing in on a year now since Oracle’s acquisition of Datanomic, and the addition of Oracle Enterprise Data Quality (EDQ) to the Oracle software family. The big move has caused some big shifts in emphasis and some very encouraging excitement from the field.  To give an illustration, combined with a shameless promotion of how EDQ can help to give quick insights into your data, I did a quick Phrase Profile of the subject field of emails to the Global EDQ mailing list since it was set up last September. The results revealed a very clear theme:   Integration, Integration, Integration! As well as the important Siebel and Oracle Data Integrator (ODI) integrations, we have been asked about integration with a huge variety of Oracle applications, including EBS, Peoplesoft, CRM on Demand, Fusion, DRM, Endeca, RightNow, and more - and we have not stood still! While it would not have been possible to develop specific pre-integrations with all of the above within a year, we have developed a package of feature-rich out-of-the-box web services and batch processes that can be plugged into any application or middleware technology with ease. And with Siebel, they work out of the box. Oracle Enterprise Data Quality version 9.0.4 includes the Customer Data Services (CDS) pack – a ready set of standard processes with standard interfaces, to provide integrated: Address verification and cleansing  Individual matching Organization matching The services can are suitable for either Batch or Real-Time processing, and are enabled for international data, with simple configuration options driving the set of locale-specific dictionaries that are used. For example, large dictionaries are provided to support international name transcription and variant matching, including highly specialized handling for Arabic, Japanese, Chinese and Korean data. In total across all locales, CDS includes well over a million dictionary entries.   Excerpt from EDQ’s CDS Individual Name Standardization Dictionary CDS has been developed to replace the OEM of Informatica Identity Resolution (IIR) for attached Data Quality on the Oracle price list, but does this in a way that creates a ‘best of both worlds’ situation for customers, who can harness not only the out-of-the-box functionality of pre-packaged matching and standardization services, but also the flexibility of OEDQ if they want to customize the interfaces or the process logic, without having to learn more than one product. From a competitive point of view, we believe this stands us in good stead against our key competitors, including Informatica, who have separate ‘Identity Resolution’ and general DQ products, and IBM, who provide limited out-of-the-box capabilities (with a steep learning curve) in both their QualityStage data quality and Initiate matching products. Here is a brief guide to the main services provided in the pack: Address Verification and Standardization EDQ’s CDS Address Cleaning Process The Address Verification and Standardization service uses EDQ Address Verification (an OEM of Loqate software) to verify and clean addresses in either real-time or batch. The Address Verification processor is wrapped in an EDQ process – this adds significant capabilities over calling the underlying Address Verification API directly, specifically: Country-specific thresholds to determine when to accept the verification result (and therefore to change the input address) based on the confidence level of the API Optimization of address verification by pre-standardizing data where required Formatting of output addresses into the input address fields normally used by applications Adding descriptions of the address verification and geocoding return codes The process can then be used to provide real-time and batch address cleansing in any application; such as a simple web page calling address cleaning and geocoding as part of a check on individual data.     Duplicate Prevention Unlike Informatica Identity Resolution (IIR), EDQ uses stateless services for duplicate prevention to avoid issues caused by complex replication and synchronization of large volume customer data. When a record is added or updated in an application, the EDQ Cluster Key Generation service is called, and returns a number of key values. These are used to select other records (‘candidates’) that may match in the application data (which has been pre-seeded with keys using the same service). The ‘driving record’ (the new or updated record) is then presented along with all selected candidates to the EDQ Matching Service, which decides which of the candidates are a good match with the driving record, and scores them according to the strength of match. In this model, complex multi-locale EDQ techniques can be used to generate the keys and ensure that the right balance between performance and matching effectiveness is maintained, while ensuring that the application retains control of data integrity and transactional commits. The process is explained below: EDQ Duplicate Prevention Architecture Note that where the integration is with a hub, there may be an additional call to the Cluster Key Generation service if the master record has changed due to merges with other records (and therefore needs to have new key values generated before commit). Batch Matching In order to allow customers to use different match rules in batch to real-time, separate matching templates are provided for batch matching. For example, some customers want to minimize intervention in key user flows (such as adding new customers) in front end applications, but to conduct a more exhaustive match on a regular basis in the back office. The batch matching jobs are also used when migrating data between systems, and in this case normally a more precise (and automated) type of matching is required, in order to minimize the review work performed by Data Stewards.  In batch matching, data is captured into EDQ using its standard interfaces, and records are standardized, clustered and matched in an EDQ job before matches are written out. As with all EDQ jobs, batch matching may be called from Oracle Data Integrator (ODI) if required. When working with Siebel CRM (or master data in Siebel UCM), Siebel’s Data Quality Manager is used to instigate batch jobs, and a shared staging database is used to write records for matching and to consume match results. The CDS batch matching processes automatically adjust to Siebel’s ‘Full Match’ (match all records against each other) and ‘Incremental Match’ (match a subset of records against all of their selected candidates) modes. The Future The Customer Data Services Pack is an important part of the Oracle strategy for EDQ, offering a clear path to making Data Quality Assurance an integral part of enterprise applications, and providing a strong value proposition for adopting EDQ. We are planning various additions and improvements, including: An out-of-the-box Data Quality Dashboard Even more comprehensive international data handling Address search (suggesting multiple results) Integrated address matching The EDQ Customer Data Services Pack is part of the Enterprise Data Quality Media Pack, available for download at http://www.oracle.com/technetwork/middleware/oedq/downloads/index.html.

    Read the article

  • Master Data Management Implementation Styles

    - by david.butler(at)oracle.com
    In any Master Data Management solution deployment, one of the key decisions to be made is the choice of the MDM architecture. Gartner and other analysts describe some different Hub deployment styles, which must be supported by a best of breed MDM solution in order to guarantee the success of the deployment project.   Registry Style: In a Registry Style MDM Hub, the various source systems publish their data and a subscribing Hub stores only the source system IDs, the Foreign Keys (record IDs on source systems) and the key data values needed for matching. The Hub runs the cleansing and matching algorithms and assigns unique global identifiers to the matched records, but does not send any data back to the source systems. The Registry Style MDM Hub uses data federation capabilities to build the "virtual" golden view of the master entity from the connected systems.   Consolidation Style: The Consolidation Style MDM Hub has a physically instantiated, "golden" record stored in the central Hub. The authoring of the data remains distributed across the spoke systems and the master data can be updated based on events, but is not guaranteed to be up to date. The master data in this case is usually not used for transactions, but rather supports reporting; however, it can also be used for reference operationally.   Coexistence Style: The Coexistence Style MDM Hub involves master data that's authored and stored in numerous spoke systems, but includes a physically instantiated golden record in the central Hub and harmonized master data across the application portfolio. The golden record is constructed in the same manner as in the consolidation style, and, in the operational world, Consolidation Style MDM Hubs often evolve into the Coexistence Style. The key difference is that in this architectural style the master data stored in the central MDM system is selectively published out to the subscribing spoke systems.   Transaction Style: In this architecture, the Hub stores, enhances and maintains all the relevant (master) data attributes. It becomes the authoritative source of truth and publishes this valuable information back to the respective source systems. The Hub publishes and writes back the various data elements to the source systems after the linking, cleansing, matching and enriching algorithms have done their work. Upstream, transactional applications can read master data from the MDM Hub, and, potentially, all spoke systems subscribe to updates published from the central system in a form of harmonization. The Hub needs to support merging of master records. Security and visibility policies at the data attribute level need to be supported by the Transaction Style hub, as well.   Adaptive Transaction Style: This is similar to the Transaction Style, but additionally provides the capability to respond to diverse information and process requests across the enterprise. This style emerged most recently to address the limitations of the above approaches. With the Adaptive Transaction Style, the Hub is built as a platform for consolidating data from disparate third party and internal sources and for serving unified master entity views to operational applications, analytical systems or both. This approach delivers a real-time Hub that has a reliable, persistent foundation of master reference and relationship data, along with all the history and lineage of data changes needed for audit and compliance tracking. On top of this persistent master data foundation, the Hub can dynamically aggregate transaction data on demand from different source systems to deliver the unified golden view to downstream systems. Data can also be accessed through batch interfaces, published to a message bus or served through a real-time services layer. New data sources can be readily added in this approach by extending the data model and by configuring the new source mappings and the survivorship rules, meaning that all legacy data hubs can be leveraged to contribute their records/rules into the new transaction hub. Finally, through rich user interfaces for data stewardship, it allows exception handling by business analysts to keep it current with business rules/practices while maintaining the reliability of best-of-breed master records.   Confederation Style: In this architectural style, several Hubs are maintained at departmental and/or agency and/or territorial level, and each of them are connected to the other Hubs either directly or via a central Super-Hub. Each Domain level Hub can be implemented using any of the previously described styles, but normally the Central Super-Hub is a Registry Style one. This is particularly important for Public Sector organizations, where most of the time it is practically or legally impossible to store in a single central hub all the relevant constituent information from all departments.   Oracle MDM Solutions can be deployed according to any of the above MDM architectural styles, and have been specifically designed to fully support the Transaction and Adaptive Transaction styles. Oracle MDM Solutions provide strong data federation and integration capabilities which are key to enabling the use of the Confederated Hub as a possible architectural style approach. Don't lock yourself into a solution that cannot evolve with your needs. With Oracle's support for any type of deployment architecture, its ability to leverage the outstanding capabilities of the Oracle technology stack, and its open interfaces for non-Oracle technology stacks, Oracle MDM Solutions provide a low TCO and a quick ROI by enabling a phased implementation strategy.

    Read the article

  • Best practices for upgrading user data when updating versions of software

    - by Javy
    In my code I check the current version of the software on launch and compare it to the version stored in the user's data file(s). If the version is newer, then I call different methods to update the old data to the newer data version, if necessary. I usually have to make a new method to convert the data with each update that changes user data in some way, and cannot remove the old ones in case there was someone who missed an update. So the app must be able to go through each method call and update their data until they get their data current. With larger data sets, this could be a problem. In addition, I recently had a brief discussion with another StackOverflow user this and he indicated he always appended a date stamp to the filename to manage data versions, although his reasoning as to why this was better than storing the version data in the file itself was unclear. Since I've rarely seen management of user data versions in books I've read, I'm curious what are the best practices for naming user data files and procedures for updating older data to newer versions.

    Read the article

  • How to tackle archived who-is personal data with opt-out?

    - by defaye
    As far as I understand it, it is possible to opt-out (in the UK at least) of having your address details displayed on who-is information of a domain for non-trading individuals. What I want to know is, after opt-out, how do individuals combat archived data? Is there any enforcement of this? How many who-is websites are there which archive data and what rights do we have to force them to remove that data without paying absurd fees? In the case of capitulating to these scoundrels, what point is it in paying for the removal of archived data if that data can presumably resurface on another who-is repository? In other words, what strategy is one supposed to take, besides being wiser after the fact?

    Read the article

  • How do I cluster strings based on a relation between two strings?

    - by Tom Wijsman
    If you don't know WEKA, you can try a theoretical answer. I don't need literal code/examples... I have a huge data set of strings in which I want to cluster the strings to find the most related ones, these could as well be seen as duplicate. I already have a set of couples of string for which I know that they are duplicate to each other, so, now I want to do some data mining on those two sets. The result I'm looking for is a system that would return me the possible most relevant couples of strings for which we don't know yet that they are duplicates, I believe that I need clustering for this, which type? Note that I want to base myself on word occurrence comparison, not on interpretation or meaning. Here is an example of two string of which we know they are duplicate (in our vision on them): The weather is really cold and it is raining. It is raining and the weather is really cold. Now, the following strings also exist (most to least relevant, ignoring stop words): Is the weather really that cold today? Rainy days are awful. I see the sunshine outside. The software would return the following two strings as most relevant, which aren't known to be duplicate: The weather is really cold and it is raining. Is the weather really that cold today? Then, I would mark that as duplicate or not duplicate and it would present me with another couple. How do I go to implement this in the most efficient way that I can apply to a large data set?

    Read the article

  • pure-ftpd debian, can't get www-data user working

    - by lynks
    I'm trying to add FTP access to the apache web files, in the past I have done this with an ftpuser and group arrangement. This time I would like to make it possible to login directly as www-data (the default apache user on debian) to make things a bit cleaner. I have checked and re-checked all the common issues; MinUID is set to 1 (www-data has uid 33) www-data has shell set to /bin/bash in /etc/passwd PAMAuthentication is off UnixAuthentication is on I have restarted pure-ftpd using /etc/init.d/pure-ftpd restart My resulting pure-ftpd run is; /usr/sbin/pure-ftpd -l unix -A -Y 1 -u 1 -E -O clf:/var/log/pure-ftpd/transfer.log -8 UTF-8 -B My syslog contains; Oct 7 19:46:40 Debian-60-squeeze-64 pure-ftpd: ([email protected]) [WARNING] Can't login as [www-data]: account disabled And my ftp client is giving me; 530 Sorry, but I can't trust you Am I missing something obvious?

    Read the article

  • Chef: nested data bag data to template file returns "can't convert String into Integer"

    - by Dalho Park
    I'm creating simple test recipe with a template and data bag. What I'm trying to do is creating a config file from data bag that has simple nested information, but I receive error "can't convert String into Integer" Here are my setting file 1) recipe/default.rb data1 = data_bag_item( 'mytest', 'qa' )['test'] data2 = data_bag_item( 'mytest', 'qa' ) template "/opt/env/test.cfg" do source "test.erb" action :create_if_missing mode 0664 owner "root" group "root" variables({ :pepe1 = data1['part.name'], :pepe2 = data2['transport.tcp.ip2'] }) end 2)my data bag named "mytest" $knife data bag show mytest qa id: qa test: part.name: L12 transport.tcp.ip: 111.111.111.111 transport.tcp.port: 9199 transport.tcp.ip2: 222.222.222.222 3)template file test.erb part.name=<%= @pepe1 % transport.tcp.binding=<%= @pepe2 % Error reurns when I run chef-client on my server, [2013-06-24T19:50:38+00:00] DEBUG: filtered backtrace of compile error: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in []',/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:inblock in from_file',/var/chef/cache/cookbooks/config_test/recipes/default.rb:12:in from_file' [2013-06-24T19:50:38+00:00] DEBUG: filtered backtrace of compile error: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in[]',/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in block in from_file',/var/chef/cache/cookbooks/config_test/recipes/default.rb:12:infrom_file' [2013-06-24T19:50:38+00:00] DEBUG: backtrace entry for compile error: '/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in `[]'' [2013-06-24T19:50:38+00:00] DEBUG: Line number of compile error: '19' Recipe Compile Error in /var/chef/cache/cookbooks/config_test/recipes/default.rb TypeError can't convert String into Integer Cookbook Trace: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in []' /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:inblock in from_file' /var/chef/cache/cookbooks/config_test/recipes/default.rb:12:in `from_file' Relevant File Content: /var/chef/cache/cookbooks/config_test/recipes/default.rb: 12: template "/opt/env/test.cfg" do 13: source "test.erb" 14: action :create_if_missing 15: mode 0664 16: owner "root" 17: group "root" 18: variables({ 19 :pepe1 = data1['part.name'], 20: :pepe2 = data2['transport.tcp.ip2'] 21: }) 22: end 23: I tried many things and if I comment out "pepe1 = data1['part.name'],", then :pepe2 = data2['transport.tcp.ip2'] works fine. only nested data "part.name" cannot be set to @pepe1. Does anyone knows why I receive the errors? thanks,

    Read the article

  • Data recovery on a corrupted 3TB disk

    - by Mark K Cowan
    Short version I probably need software to run a deep-scan recovery (ideally on Linux) to find files on NTFS filesystem. The file data is intact, but the references are no longer present. Analogous to recovering data from a "quick-formatted" partition. Hopefully there is a smarter way available than deep-scan, one which would recover filenames and possibly paths. Long version I have a 3TB disk containing a load of backups. Windows 7 SP1 refused to detect the disk when plugged in directly via SATA, so I put it on a USB/SATA adaptor which seemed to work at first. The SATA/USB adaptor probably does not support disks over 2.2TB though. Windows first asked me if I wanted to 'format' the disk, then later showed me most of the contents but some folder were inaccessible. I stupidly decided to run a CHKDSK on my backup disk, which made the folders accessible but also left them empty. I connected this disk via SATA to my main PC (Arch Linux). I tried: testdisk ntfsundelete ntfsfix --no-action (to look for diagnostically relevant faults, disk was "OK" though) to no avail as the files references in the tables had presumably been zeroed out by CHKDSK, rather than using a typical journal'd deletion). If it is useful at all, a majority of the files that I want to recover are JPEG, Photoshop PSD, and MPEG-3/MPEG-4/AVI/MKV files. If worst comes to worst, I'll just design my own sector scanner and use some simple heuristic-driven analysis to recover raw binary blocks of data from the disk which appears to match the structures of the above file types. I am unfamiliar with the exact workings of NTFS but used to be proficient at recovering FAT32 systems with just a hex-editor, so I can provide any useful diagnostic information if you let me know how to find it! My priorities in ascending order of importance for choosing the accepted answer: Restores directory structure Recovers many filenames in addition to the file data Is free / very cheap Runs on Linux Recovers a majority of file data The last point is the most important, but the more of the higher points you match the more rep you'll probably get :)

    Read the article

< Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >