Search Results

Search found 76470 results on 3059 pages for 'data source'.


  • CoreData: Same predicate (IN) returns different fetched results after a Save operation

    - by Jason Lee
    I have code below: NSArray *existedTasks = [[TaskBizDB sharedInstance] fetchTasksWatchedByMeOfProject:projectId]; [context save:&error]; existedTasks = [[TaskBizDB sharedInstance] fetchTasksWatchedByMeOfProject:projectId]; NSArray *allTasks = [[TaskBizDB sharedInstance] fetchTasksOfProject:projectId]; First line returns two objects; Second line save the context; Third line returns just one object, which is contained in the 'two objects' above; And the last line returns 6 objects, containing the 'two objects' returned at the first line. The fetch interface works like below: WXModel *model = [WXModel modelWithEntity:NSStringFromClass([WQPKTeamTask class])]; NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(%@ IN personWatchers) AND (projectId == %d)", currentLoginUser, projectId]; [model setPredicate:predicate]; NSArray *fetchedTasks = [model fetch]; if (fetchedTasks.count == 0) return nil; return fetchedTasks; What confused me is that, with the same fetch request, why return different results just after a save? Here comes more detail: The 'two objects' returned at the first line are: <WQPKTeamTask: 0x1b92fcc0> (entity: WQPKTeamTask; id: 0x1b9300f0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p9> ; data: { projectId = 372004; taskId = 338001; personWatchers = ( "0xf0bf440 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WWPerson/p1>" ); } <WQPKTeamTask: 0xf3f6130> (entity: WQPKTeamTask; id: 0xf3cb8d0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p11> ; data: { projectId = 372004; taskId = 340006; personWatchers = ( "0xf0bf440 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WWPerson/p1>" ); } And the only one object returned at third line is: <WQPKTeamTask: 0x1b92fcc0> (entity: WQPKTeamTask; id: 0x1b9300f0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p9> ; data: { projectId = 372004; taskId = 338001; personWatchers = ( "0xf0bf440 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WWPerson/p1>" ); } Printing description of allTasks: <_PFArray 0xf30b9a0>( <WQPKTeamTask: 0xf3ab9d0> (entity: WQPKTeamTask; id: 0xf3cda40 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p6> ; data: <fault>), <WQPKTeamTask: 0xf315720> (entity: WQPKTeamTask; id: 0xf3c23a0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p7> ; data: <fault>), <WQPKTeamTask: 0xf3a1ed0> (entity: WQPKTeamTask; id: 0xf3cda30 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p8> ; data: <fault>), <WQPKTeamTask: 0x1b92fcc0> (entity: WQPKTeamTask; id: 0x1b9300f0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p9> ; data: { projectId = 372004; taskId = 338001; personWatchers = ( "0xf0bf440 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WWPerson/p1>" ); }), <WQPKTeamTask: 0xf325e50> (entity: WQPKTeamTask; id: 0xf343820 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p10> ; data: <fault>), <WQPKTeamTask: 0xf3f6130> (entity: WQPKTeamTask; id: 0xf3cb8d0 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WQPKTeamTask/p11> ; data: { projectId = 372004; taskId = 340006; personWatchers = ( "0xf0bf440 <x-coredata://CFFD3F8B-E613-4DE8-85AA-4D6DD08E88C5/WWPerson/p1>" ); }) ) UPDATE 1 If I call the same interface fetchTasksWatchedByMeOfProject: in: #pragma mark - NSFetchedResultsController Delegate - (void)controllerDidChangeContent:(NSFetchedResultsController *)controller { I will get 'two objects' as well. 
UPDATE 2 I've tried: NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(ANY personWatchers == %@) AND (projectId == %d)", currentLoginUser, projectId]; NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(ANY personWatchers.personId == %@) AND (projectId == %d)", currentLoginUserId, projectId]; Still the same result. UPDATE 3 I've checked the save:&error, error is nil.


  • Losing data after reading it correctly from a file

    - by user1388172
    I have the following class and a data structure (the ADT, an abstract data type, is a linked list) which I use together in main. After I read the input data from a file and create an object, it prints just fine; but after push_back() the third int variable gets reset to 0. Example and code: ex.in: 1 7 31 2 2 2 3 3 3 Now I create objects from each line, which print as they should, but after push_back(): 1 7 0 2 2 0 3 3 0 Class.h: class RAngle { private: int x,y,l,b; public: int solution,prec; RAngle(){ x = y = solution = prec = b = l =0; } RAngle(int i,int j,int k){ x = i; y = j; l = k; solution = 0; prec=0; b=0; } friend ostream& operator << (ostream& out, const RAngle& ra){ out << ra.x << " " << ra.y << " " << ra.l <<endl; return out; } friend istream& operator >>( istream& is, RAngle& ra){ is >> ra.x; is >> ra.y; is >> ra.l; return is ; } }; ADT.h: template <class T> class List { private: struct Elem { T data; Elem* next; }; Elem* first; T pop_front(){ if (first!=NULL) { T aux = first->data; first = first->next; return aux; } T a; return a; } void push_back(T data){ Elem *n = new Elem; n->data = data; n->next = NULL; if (first == NULL) { first = n; return ; } Elem *current; for(current=first;current->next != NULL;current=current->next); current->next = n; } Main.cpp (after I call this function in main, which prints the objects as they should be, the x var (from the RAngle class) changes to 0 in all cases): void readData(List <RAngle> &l){ RAngle r; ifstream f_in; f_in.open("ex.in",ios::in); for(int i=0;i<10;++i){ f_in >> r; cout << r; l.push_back(r); }


  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.  Query tuning is not complete as soon as the query returns results quickly in the development or test environments.  In production, your query will compete for memory, CPU, locks, I/O and other resources on the server.  Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data.  In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row.  The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type.  A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL,   CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL,   CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table.  
The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results.  This seems slow, considering that the table only has 50,000 rows.  Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time.  Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring.  The script below monitors STATISTICS IO output and the amount of tempdb used by the test query.  We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB.  The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row.  With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. 
In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. 
/ 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  
This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  
Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb.  The small remaining inefficiency is in reading the id column values from the clustered primary key index.  As a clustered index, it contains all the in-row data at its leaf.  The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead.  The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure.  This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only.  This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. 
CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data.  The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings.  It isn’t about the internal differences between TEXT and MAX data types either.  It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.  No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance.  Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows.  5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is!  If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb.  Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough.  Tuning for logical reads and adding covering indexes is not enough.  If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on.  The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008.  They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator.  Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi


  • CoreData update problems

    - by kpower
    My app makes updates in background thread then saves context changes. And in main context there is a table view that works with NSFetchedResultsController. For some time updates work correctly, but then exception is thrown. To check this I've added NSLog(@"%@", [self.controller fetchedObjects]); to -controllerDidChangeContent:. Here is what I got: "<PRBattle: 0x6d30530> (entity: PRBattle; id: 0x6d319d0 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p2> ; data: {\n battleId = \"-1\";\n finishedAt = \"2012-11-06 11:37:36 +0000\";\n opponent = \"0x6d2f730 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p1>\";\n opponentScore = nil;\n score = nil;\n status = 4;\n})", "<PRBattle: 0x6d306f0> (entity: PRBattle; id: 0x6d319f0 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p1> ; data: {\n battleId = \"-1\";\n finishedAt = \"2012-11-06 11:37:36 +0000\";\n opponent = \"0x6d2ddb0 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p3>\";\n opponentScore = nil;\n score = nil;\n status = 4;\n})", "<PRBattle: 0x6d30830> (entity: PRBattle; id: 0x6d31650 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p11> ; data: <fault>)", "<PRBattle: 0x6d306b0> (entity: PRBattle; id: 0x6d319e0 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p5> ; data: {\n battleId = 325;\n finishedAt = nil;\n opponent = \"0x6d2f730 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p1>\";\n opponentScore = 91;\n score = 59;\n status = 3;\n})", "<PRBattle: 0x6d30730> (entity: PRBattle; id: 0x6d31a00 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p6> ; data: {\n battleId = 323;\n finishedAt = nil;\n opponent = \"0x6d2ddb0 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p3>\";\n opponentScore = 0;\n score = 0;\n status = 3;\n})", "<PRBattle: 0x6d307b0> (entity: PRBattle; id: 0x6d31630 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p9> ; data: {\n battleId = 370;\n finishedAt = \"2012-11-06 14:24:14 +0000\";\n opponent = \"0x79a8e90 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p2>\";\n opponentScore = 180;\n score = 180;\n status = 4;\n})", "<PRBattle: 0x6d307f0> (entity: PRBattle; id: 0x6d31640 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p10> ; data: {\n battleId = 309;\n finishedAt = \"2012-11-02 01:19:27 +0000\";\n opponent = \"0x79a8e90 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p2>\";\n opponentScore = 120;\n score = 240;\n status = 4;\n})", "<PRBattle: 0x6d30770> (entity: PRBattle; id: 0x6d31620 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p7> ; data: {\n battleId = 315;\n finishedAt = \"2012-11-02 02:26:24 +0000\";\n opponent = \"0x79a8e90 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PROpponent/p2>\";\n opponentScore = 119;\n score = 179;\n status = 4;\n})" ) Faulted object (0xe972610) here causes crash. I've logged data during update & before saving. This object is in updatedObjects only. Why can this method return "bad" object? (Moreover, during updates this object is affected almost each update. And only after some passes becomes "bad" one). P.S.: I use RestKit to manage CoreData. UPDATED: The exception was got, when I did smth. like this: for (PRBattle *battle in [self.controller fetchedObjects) { switch (battle.statusScalar) { case ... default: [battle willAccessValueForKey:nil]; NSAssert1(NO, @"Unexpected battle status found: %@", battle); } } The exception is on line with -willAccessValueForKey:. 
Scalar status for battle is enum, that is bind to integer values 1..4. I've mentioned all possible values in switch's cases (above default:). And the last one has break;. So this one is possible only when battle.statusScalar returns non-enum value. Status scalar implementation in PRBattle: - (PRBattleStatuses)statusScalar { [self willAccessValueForKey:@"statusScalar"]; PRBattleStatuses result = (PRBattleStatuses)[self.status integerValue]; [self didAccessValueForKey:@"statusScalar"]; return result; } And battle.status has validation rules: - min-value: 1 - max-value: 4 - default: no value And the last thing - debug log: objc[4664]: EXCEPTIONS: throwing 0x7d33f80 (object 0xe67d2a0, a _NSCoreDataException) objc[4664]: EXCEPTIONS: searching through frame [ip=0x97b401 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: catch(id) objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x97b401 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: handling exception 0x7d33f60 at 0x97b79f objc[4664]: EXCEPTIONS: rethrowing current exception objc[4664]: EXCEPTIONS: searching through frame [ip=0x97b911 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x9ac8b7 sp=0xbfffdc20] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x97ee80 sp=0xbfffdc40] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x361d0 sp=0xbfffdc70] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0xa701d8 sp=0xbfffde10] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: catch(id) objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x97b911 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: finishing handler objc[4664]: EXCEPTIONS: searching through frame [ip=0x97b963 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x9ac8b7 sp=0xbfffdc20] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x97ee80 sp=0xbfffdc40] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0x361d0 sp=0xbfffdc70] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: searching through frame [ip=0xa701d8 sp=0xbfffde10] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: catch(id) objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x97b963 sp=0xbfffd9b0] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x9ac8b7 sp=0xbfffdc20] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x97ee80 sp=0xbfffdc40] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x361d0 sp=0xbfffdc70] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: unwinding through frame [ip=0x3656f sp=0xbfffdc70] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: unwinding through frame [ip=0xa701d8 sp=0xbfffde10] for exception 0x7d33f60 objc[4664]: EXCEPTIONS: handling exception 0x7d33f60 at 0xa701f5 2012-11-07 13:37:55.463 TestApp[4664:fb03] CoreData: error: Serious application error. An exception was caught from the delegate of NSFetchedResultsController during a call to -controllerDidChangeContent:. CoreData could not fulfill a fault for '0x6d31650 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p10>' with userInfo { NSAffectedObjectsErrorKey = ( "<PRBattle: 0x6d30830> (entity: PRBattle; id: 0x6d31650 <x-coredata://882BD521-90CD-4682-B19A-000A4976E471/PRBattle/p10> ; data: <fault>)" ); }


  • Fuzzy match two hash tables?

    - by alex
    Hi, I'm looking for ideas on how best to match two hash tables containing string key/value pairs. Here's the actual problem I'm facing: structured data comes in and is imported into the database. I need to UPDATE records which are already in the DB; however, it's possible that ANY value in the source can change, so I don't have a reliable ID. I'm thinking of fuzzy matching the two rows, source and DB, and making an "educated" guess as to whether the record should be updated or inserted. Any ideas would be greatly appreciated.
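    A minimal sketch of the fuzzy-matching idea described above, assuming Python and dict-shaped rows: average a per-field string similarity (difflib.SequenceMatcher) across the columns, keep the best-scoring existing row, and treat it as an UPDATE only when the score clears a threshold. The field names and the 0.8 cutoff below are hypothetical placeholders, not part of the original question.

    from difflib import SequenceMatcher

    def row_similarity(source_row, db_row):
        # Average per-field similarity between two dict-like rows of strings.
        keys = set(source_row) | set(db_row)
        scores = [
            SequenceMatcher(None,
                            str(source_row.get(k, "")).lower(),
                            str(db_row.get(k, "")).lower()).ratio()
            for k in keys
        ]
        return sum(scores) / len(scores) if scores else 0.0

    def match_or_insert(source_row, db_rows, threshold=0.8):
        # Return ('update', best_match) if some existing row is close enough,
        # otherwise ('insert', None).  The threshold is a tuning knob.
        best_row, best_score = None, 0.0
        for db_row in db_rows:
            score = row_similarity(source_row, db_row)
            if score > best_score:
                best_row, best_score = db_row, score
        return ("update", best_row) if best_score >= threshold else ("insert", None)

    # Example: small spelling/format differences still match the stored row.
    incoming = {"name": "ACME Corp.", "address": "1 Main St", "phone": "555-0100"}
    existing = [{"name": "Acme Corp", "address": "1 Main Street", "phone": "555-0100"}]
    print(match_or_insert(incoming, existing))   # -> ('update', {...})

    One design note: weighting fields differently (for example, giving a phone number or email more weight than a free-text field) usually makes the "educated guess" far more reliable than a flat average.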


  • Using TTMessageController with Multiple Data Sources

    - by PF1
    Hi Everyone: I am wondering if there is some way to set a separate .datasource for each custom field in a TTMessageController. Right now I am using a TTMessageController subclass (and referencing the data source controller delegate) and simply setting the data source of the message controller to self. But I believe this will only work for one field with one set of options. Thanks for any help!


  • Read data matrix barcodes on iPhone?

    - by Saurabh
    Hi all, is there any open source project I can use to read Data Matrix codes (not QR codes; I know I can use the ZXing project to read QR codes) on the iPhone? Any Java open source project would also be helpful (I'll convert it into a web service and use it from the iPhone). Any help would be much appreciated! Thanks, Saurabh


  • Looking for a visualization and charting package

    - by Jeff Meatball Yang
    I have some specific requirements, with the most important at the top:
    - Can plot line and stacked bar charts
    - Can customize mouse events (hover, click) on chart data
    - Compatible and performant with IE7/8 (likely will use excanvas.js)
    - Can optionally control label formats, legends, colors
    - Open source preferred, or at least can purchase the source
    - Can be hosted locally
    I have seen a couple potentially good ones:
    - Google's interactive charts (but code must be accessed via Google servers)
    - EJSChart
    - Flot
    Does anyone have experience with these, or others, and can make a recommendation?


  • TextBox data binding validation

    - by Koynov
    Is it possible to get validation errors (produced by the binding source through IDataErrorInfo or INotifyDataErrorInfo) without accessing the data source? The point is to get the error message which is going to be displayed. Thank you in advance.


  • Populating data in multiple cascading dropdown boxes in Access 2007

    - by miCRoSCoPiC_eaRthLinG
    Hello all, I've been assigned the task to design a temporary customer tracking system in MS Access 2007 (sheeeesh!). The tables and relationships have all been setup successfully. But I'm running into a minor problem while trying to design the data entry form for one table... Here's a bit of explanation first. The screen contains 3 dropdown boxes (apart from other fields). 1st dropdown The first dropdown (cboMarket) represents the Market lets users select between 2 options: Domestic International Since the first dropdown contains only 2 items I didn't bother making a table for it. I added them as pre-defined list items. 2nd dropdown Once the user makes a selection in this one, the second dropdown (cboLeadCategory) loads up a list of Lead Categories, namely, Fairs & Exhibitions, Agents, Press Ads, Online Ads etc. Different sets of lead categories are utilized for the 2 markets. Hence this box is dependent on the 1st one. Structure of the bound table, named Lead_Cateogries for the 2nd combo is: ID Autonumber Lead_Type TEXT <- actually a list that takes up Domestic or International Lead_Category_Name TEXT 3rd dropdown And based on the choice of category in the 2nd one, the third one (cboLeadSource) is supposed to display a pre-defined set of lead sources belonging to the particular category. Table is named Lead_Sources and the structure is: ID Autonumber Lead_Category NUMBER <- related to ID of Lead Categories table Lead_Source TEXT When I make the selection in the 1st dropdown, the AfterUpdate event of the combo is called, which instructs the 2nd dropdown to load contents: Private Sub cboMarket_AfterUpdate() Me![cboLead_Category].Requery End Sub The Row Source of the 2nd combo contains a query: SELECT Lead_Categories.ID, Lead_Categories.Lead_Category_Name FROM Lead_Categories WHERE Lead_Categories.Lead_Type=[cboMarket] ORDER BY Lead_Categories.Lead_Category_Name; The AfterUpdate event of 2nd combo is: Private Sub cboLeadCategory_AfterUpdate() Me![cboLeadSource].Requery End Sub The Row Source of 3rd combo contains: SELECT Leads_Sources.ID, Leads_Sources.Lead_Source FROM Leads_Sources WHERE [Lead_Sources].[Lead_Category]=[Lead_Categories].[ID] ORDER BY Leads_Sources.Lead_Source; Problem When I select Market type from cboMarket, the 2nd combo cboLeadCategory loads up the appropriate Categories without a hitch. But when I select a particular Category from it, instead of the 3rd combo loading the lead source names, a modal dialog is displayed asking me to Enter a Parameter. When I enter anything into this prompt (valid or invalid data), I get yet another prompt: Why is this happening? Why isn't the 3rd box loading the source names as desired. Can any one please shed some light on where I am going wrong? Thanks, m^e


  • How to use iptables to forward all data from an IP to a Virtual Machine

    - by jro
    OK, in an attempt to get some response, a TL;DR version. I know that the following command: iptables -A PREROUTING -t nat -i eth0 --dport 80 --source 1.1.1.1 -j REDIRECT --to-port 8080 ... will redirect all traffic from port 80 to port 8080. The problem is that I have to do this for every port that is to be redirected. To be future-proof, I want all ports for an IP to be redirected to a different (internal) IP, so that if one might decide to enable SSH, they can directly connect without worrying about iptables. What is needed to reliable forward all traffic from an external IP, to an internal IP, and vice versa? Extended version I've scoured the internet for this, but I never got a solid answer. What I have is one physical server (HOST), with several virtual machines (VM) that need traffic redirected to them. Just getting it to work with a single machine is enough for now. The VM's run under VirtualBox, and are set to use a host-only adapter (vboxnet0). Everything seems to work, but it is greatly lagging. Both the host (CentOS 5.6) and the guest (Ubuntu 10.04) machine are running Linux. What I did was the following: Configure the VM to have a static IP in the network of the vboxnet0 adapter. Add an IP alias to the host, registering to the dedicated (outside) IP. Setup iptables to allow traffic to come through (via sysctl). Configure iptables to DNAT and SNAT data from a given IP address to the internal address. iptables commands: sudo iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT sudo iptables -A POSTROUTING -t nat -j MASQUERADE iptables -t nat -I PREROUTING -d $OUT_IP -I eth0 -j DNAT --to-destination $IN_IP iptables -t nat -I POSTROUTING -s $IN_IP -o eth0 -j SNAT --to-source $OUT_IP Now the site works, but is really, really slow. I'm hoping I missed something simple, but I'm out of ideas for now. Some background info: before this, the site was working with basic port forwarding. E.g. port 80 was mapped to port 8080 using iptables. In VirtualBox (having the network adapter configured as NAT), a port forwarding the other way around made things work beautifully. The problem was twofold: first, multiple ports needed to be forwarded (for admin interfaces, https, ssh, etc). Second, it only allowed one IP address to use port 80. To resolve things, multiple external IP addresses are used for different (sub)domains. Likewise, the "VirtualBox" network will contain the virtual machines: DNS Ext. IP Adapter VM "VirtalBox" IP ------------------------------------------------------------------ a.example.com 1.1.1.1 eth0:1 vm_guest_1 192.168.56.1 b.example.com 2.2.2.2 eth0:2 vm_guest_2 192.168.56.2 c.example.com 3.3.3.3 eth0:3 vm_guest_3 192.168.56.3 And so on. Put simply, the goal is to channel all traffic from a.example.com to vm_guest_1 (of put differently, from 1.1.1.1 to 192.168.56.1). And achieve this with an acceptable speed :).


  • Reading Data from DDFS ValueError: No JSON object could be decoded

    - by secumind
    I'm running dozens of map reduce jobs for a number of different purposes using disco. My data has grown enormous and I thought I would try using DDFS for a change rather than standard txt files. I've followed the DISCO map/reduce example Counting Words as a map/reduce job, without to much difficulty and with the help of others, Reading JSON specific data into DISCO I've gotten past one of my latest problems. I'm trying to read data in/out of ddfs to better chunk and distribute it but am having a bit of trouble. Here's an example file: file.txt {"favorited": false, "in_reply_to_user_id": null, "contributors": null, "truncated": false, "text": "I'll call him back tomorrow I guess", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": null, "coordinates": null, "in_reply_to_user_id_str": null, "entities": {"user_mentions": [], "hashtags": [], "urls": []}, "in_reply_to_status_id": null, "id_str": "168931016843603968", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/305726905/FASHION-3.png", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1818996723/image_normal.jpg", "profile_sidebar_fill_color": "292727", "is_translator": false, "id": 113532729, "profile_text_color": "000000", "followers_count": 78, "protected": false, "location": "With My Niggas In Paris!", "default_profile_image": false, "listed_count": 0, "utc_offset": -21600, "statuses_count": 6733, "description": "Made in CHINA., Educated && Making My Own $$. Fear GOD && Put Him 1st. #TeamFollowBack #TeamiPhone\n", "friends_count": 74, "profile_link_color": "b03f3f", "profile_image_url": "http://a2.twimg.com/profile_images/1818996723/image_normal.jpg", "notifications": null, "show_all_inline_media": false, "geo_enabled": true, "profile_background_color": "1f9199", "id_str": "113532729", "profile_background_image_url": "http://a3.twimg.com/profile_background_images/305726905/FASHION-3.png", "name": "Bee'Jay", "lang": "en", "profile_background_tile": true, "favourites_count": 19, "screen_name": "OohMyBEEsNice", "url": "http://www.bitchimpaid.org", "created_at": "Fri Feb 12 03:32:54 +0000 2010", "contributors_enabled": false, "time_zone": "Central Time (US & Canada)", "profile_sidebar_border_color": "000000", "default_profile": false, "following": null}, "in_reply_to_screen_name": null, "retweet_count": 0, "geo": null, "id": 168931016843603968, "source": "<a href=\"http://twitter.com/#!/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"} {"favorited": false, "in_reply_to_user_id": 50940453, "contributors": null, "truncated": false, "text": "@LegaMrvica @MimozaBand makasi om artis :D kadoo kadoo", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": "168653037894770688", "coordinates": null, "in_reply_to_user_id_str": "50940453", "entities": {"user_mentions": [{"indices": [0, 11], "screen_name": "LegaMrvica", "id": 50940453, "name": "Lega_thePianis", "id_str": "50940453"}, {"indices": [12, 23], "screen_name": "MimozaBand", "id": 375128905, "name": "Mimoza", "id_str": "375128905"}], "hashtags": [], "urls": []}, "in_reply_to_status_id": 168653037894770688, "id_str": "168931016868761600", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": 
"https://si0.twimg.com/profile_background_images/347686061/Galungan_dan_Kuningan.jpg", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1803845596/Picture_20124_normal.jpg", "profile_sidebar_fill_color": "DDFFCC", "is_translator": false, "id": 48293450, "profile_text_color": "333333", "followers_count": 182, "protected": false, "location": "\u00dcT: -6.906799,107.622383", "default_profile_image": false, "listed_count": 0, "utc_offset": -28800, "statuses_count": 3052, "description": "Fashion design maranatha '11 // traditional dancer (bali) at sanggar tampak siring & Natya Nataraja", "friends_count": 206, "profile_link_color": "0084B4", "profile_image_url": "http://a3.twimg.com/profile_images/1803845596/Picture_20124_normal.jpg", "notifications": null, "show_all_inline_media": false, "geo_enabled": true, "profile_background_color": "9AE4E8", "id_str": "48293450", "profile_background_image_url": "http://a0.twimg.com/profile_background_images/347686061/Galungan_dan_Kuningan.jpg", "name": "nana afiff", "lang": "en", "profile_background_tile": true, "favourites_count": 2, "screen_name": "hasnfebria", "url": null, "created_at": "Thu Jun 18 08:50:29 +0000 2009", "contributors_enabled": false, "time_zone": "Pacific Time (US & Canada)", "profile_sidebar_border_color": "BDDCAD", "default_profile": false, "following": null}, "in_reply_to_screen_name": "LegaMrvica", "retweet_count": 0, "geo": null, "id": 168931016868761600, "source": "<a href=\"http://blackberry.com/twitter\" rel=\"nofollow\">Twitter for BlackBerry\u00ae</a>"} {"favorited": false, "in_reply_to_user_id": 27260086, "contributors": null, "truncated": false, "text": "@justinbieber u were born to be somebody, and u're super important in beliebers' life. thanks for all biebs. I love u. follow me? 84", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": null, "coordinates": null, "in_reply_to_user_id_str": "27260086", "entities": {"user_mentions": [{"indices": [0, 13], "screen_name": "justinbieber", "id": 27260086, "name": "Justin Bieber", "id_str": "27260086"}], "hashtags": [], "urls": []}, "in_reply_to_status_id": null, "id_str": "168931016856178688", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/416005864/Captura.JPG", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1808883280/Captura6_normal.JPG", "profile_sidebar_fill_color": "f5e7f3", "is_translator": false, "id": 406750700, "profile_text_color": "333333", "followers_count": 1122, "protected": false, "location": "Adentro de una supra.", "default_profile_image": false, "listed_count": 0, "utc_offset": -14400, "statuses_count": 20966, "description": "Mi \u00eddolo es @justinbieber , si te gusta \u00a1genial!, si no, solo respetalo. El cambi\u00f3 mi vida completamente y mi sue\u00f1o es conocerlo #TrueBelieber . 
", "friends_count": 1015, "profile_link_color": "9404b8", "profile_image_url": "http://a1.twimg.com/profile_images/1808883280/Captura6_normal.JPG", "notifications": null, "show_all_inline_media": false, "geo_enabled": false, "profile_background_color": "f9fcfa", "id_str": "406750700", "profile_background_image_url": "http://a3.twimg.com/profile_background_images/416005864/Captura.JPG", "name": "neversaynever,right?", "lang": "es", "profile_background_tile": false, "favourites_count": 22, "screen_name": "True_Belieebers", "url": "http://www.wehavebieber-fever.tumblr.com", "created_at": "Mon Nov 07 04:17:40 +0000 2011", "contributors_enabled": false, "time_zone": "Santiago", "profile_sidebar_border_color": "C0DEED", "default_profile": false, "following": null}, "in_reply_to_screen_name": "justinbieber", "retweet_count": 0, "geo": null, "id": 168931016856178688, "source": "<a href=\"http://yfrog.com\" rel=\"nofollow\">Yfrog</a>"} I load it into DDFS with: # ddfs chunk data:test1 ./file.txt created: disco://localhost/ddfs/vol0/blob/44/file_txt-0$549-db27b-125e1 I test that the file is indeed loaded into ddfs with: # ddfs xcat data:test1 {"favorited": false, "in_reply_to_user_id": null, "contributors": null, "truncated": false, "text": "I'll call him back tomorrow I guess", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": null, "coordinates": null, "in_reply_to_user_id_str": null, "entities": {"user_mentions": [], "hashtags": [], "urls": []}, "in_reply_to_status_id": null, "id_str": "168931016843603968", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/305726905/FASHION-3.png", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1818996723/image_normal.jpg", "profile_sidebar_fill_color": "292727", "is_translator": false, "id": 113532729, "profile_text_color": "000000", "followers_count": 78, "protected": false, "location": "With My Niggas In Paris!", "default_profile_image": false, "listed_count": 0, "utc_offset": -21600, "statuses_count": 6733, "description": "Made in CHINA., Educated && Making My Own $$. Fear GOD && Put Him 1st. 
#TeamFollowBack #TeamiPhone\n", "friends_count": 74, "profile_link_color": "b03f3f", "profile_image_url": "http://a2.twimg.com/profile_images/1818996723/image_normal.jpg", "notifications": null, "show_all_inline_media": false, "geo_enabled": true, "profile_background_color": "1f9199", "id_str": "113532729", "profile_background_image_url": "http://a3.twimg.com/profile_background_images/305726905/FASHION-3.png", "name": "Bee'Jay", "lang": "en", "profile_background_tile": true, "favourites_count": 19, "screen_name": "OohMyBEEsNice", "url": "http://www.bitchimpaid.org", "created_at": "Fri Feb 12 03:32:54 +0000 2010", "contributors_enabled": false, "time_zone": "Central Time (US & Canada)", "profile_sidebar_border_color": "000000", "default_profile": false, "following": null}, "in_reply_to_screen_name": null, "retweet_count": 0, "geo": null, "id": 168931016843603968, "source": "<a href=\"http://twitter.com/#!/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"} {"favorited": false, "in_reply_to_user_id": 50940453, "contributors": null, "truncated": false, "text": "@LegaMrvica @MimozaBand makasi om artis :D kadoo kadoo", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": "168653037894770688", "coordinates": null, "in_reply_to_user_id_str": "50940453", "entities": {"user_mentions": [{"indices": [0, 11], "screen_name": "LegaMrvica", "id": 50940453, "name": "Lega_thePianis", "id_str": "50940453"}, {"indices": [12, 23], "screen_name": "MimozaBand", "id": 375128905, "name": "Mimoza", "id_str": "375128905"}], "hashtags": [], "urls": []}, "in_reply_to_status_id": 168653037894770688, "id_str": "168931016868761600", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/347686061/Galungan_dan_Kuningan.jpg", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1803845596/Picture_20124_normal.jpg", "profile_sidebar_fill_color": "DDFFCC", "is_translator": false, "id": 48293450, "profile_text_color": "333333", "followers_count": 182, "protected": false, "location": "\u00dcT: -6.906799,107.622383", "default_profile_image": false, "listed_count": 0, "utc_offset": -28800, "statuses_count": 3052, "description": "Fashion design maranatha '11 // traditional dancer (bali) at sanggar tampak siring & Natya Nataraja", "friends_count": 206, "profile_link_color": "0084B4", "profile_image_url": "http://a3.twimg.com/profile_images/1803845596/Picture_20124_normal.jpg", "notifications": null, "show_all_inline_media": false, "geo_enabled": true, "profile_background_color": "9AE4E8", "id_str": "48293450", "profile_background_image_url": "http://a0.twimg.com/profile_background_images/347686061/Galungan_dan_Kuningan.jpg", "name": "nana afiff", "lang": "en", "profile_background_tile": true, "favourites_count": 2, "screen_name": "hasnfebria", "url": null, "created_at": "Thu Jun 18 08:50:29 +0000 2009", "contributors_enabled": false, "time_zone": "Pacific Time (US & Canada)", "profile_sidebar_border_color": "BDDCAD", "default_profile": false, "following": null}, "in_reply_to_screen_name": "LegaMrvica", "retweet_count": 0, "geo": null, "id": 168931016868761600, "source": "<a href=\"http://blackberry.com/twitter\" rel=\"nofollow\">Twitter for BlackBerry\u00ae</a>"} {"favorited": false, "in_reply_to_user_id": 27260086, "contributors": null, "truncated": false, "text": "@justinbieber u were born to be somebody, and u're super 
important in beliebers' life. thanks for all biebs. I love u. follow me? 84", "created_at": "Mon Feb 13 05:34:27 +0000 2012", "retweeted": false, "in_reply_to_status_id_str": null, "coordinates": null, "in_reply_to_user_id_str": "27260086", "entities": {"user_mentions": [{"indices": [0, 13], "screen_name": "justinbieber", "id": 27260086, "name": "Justin Bieber", "id_str": "27260086"}], "hashtags": [], "urls": []}, "in_reply_to_status_id": null, "id_str": "168931016856178688", "place": null, "user": {"follow_request_sent": null, "profile_use_background_image": true, "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/416005864/Captura.JPG", "verified": false, "profile_image_url_https": "https://si0.twimg.com/profile_images/1808883280/Captura6_normal.JPG", "profile_sidebar_fill_color": "f5e7f3", "is_translator": false, "id": 406750700, "profile_text_color": "333333", "followers_count": 1122, "protected": false, "location": "Adentro de una supra.", "default_profile_image": false, "listed_count": 0, "utc_offset": -14400, "statuses_count": 20966, "description": "Mi \u00eddolo es @justinbieber , si te gusta \u00a1genial!, si no, solo respetalo. El cambi\u00f3 mi vida completamente y mi sue\u00f1o es conocerlo #TrueBelieber . ", "friends_count": 1015, "profile_link_color": "9404b8", "profile_image_url": "http://a1.twimg.com/profile_images/1808883280/Captura6_normal.JPG", "notifications": null, "show_all_inline_media": false, "geo_enabled": false, "profile_background_color": "f9fcfa", "id_str": "406750700", "profile_background_image_url": "http://a3.twimg.com/profile_background_images/416005864/Captura.JPG", "name": "neversaynever,right?", "lang": "es", "profile_background_tile": false, "favourites_count": 22, "screen_name": "True_Belieebers", "url": "http://www.wehavebieber-fever.tumblr.com", "created_at": "Mon Nov 07 04:17:40 +0000 2011", "contributors_enabled": false, "time_zone": "Santiago", "profile_sidebar_border_color": "C0DEED", "default_profile": false, "following": null}, "in_reply_to_screen_name": "justinbieber", "retweet_count": 0, "geo": null, "id": 168931016856178688, "source": "<a href=\"http://yfrog.com\" rel=\"nofollow\">Yfrog</a> At this point everything is great, I load up the script that resulted from a previous Stack Post: from disco.core import Job, result_iterator import gzip def map(line, params): import unicodedata import json r = json.loads(line).get('text') s = unicodedata.normalize('NFD', r).encode('ascii', 'ignore') for word in s.split(): yield word, 1 def reduce(iter, params): from disco.util import kvgroup for word, counts in kvgroup(sorted(iter)): yield word, sum(counts) if __name__ == '__main__': job = Job().run(input=["tag://data:test1"], map=map, reduce=reduce) for word, count in result_iterator(job.wait(show=True)): print word, count NOTE: That this script runs file if the input=["file.txt"], however when I run it with "tag://data:test1" I get the following error: # DISCO_EVENTS=1 python count_normal_words.py Job@549:db30e:25bd8: Status: [map] 0 waiting, 1 running, 0 done, 0 failed 2012/11/25 21:43:26 master New job initialized! 
2012/11/25 21:43:26 master Starting job 2012/11/25 21:43:26 master Starting map phase 2012/11/25 21:43:26 master map:0 assigned to solice 2012/11/25 21:43:26 master ERROR: Job failed: Worker at 'solice' died: Traceback (most recent call last): File "/home/DISCO/data/solice/01/Job@549:db30e:25bd8/usr/local/lib/python2.7/site-packages/disco/worker/__init__.py", line 329, in main job.worker.start(task, job, **jobargs) File "/home/DISCO/data/solice/01/Job@549:db30e:25bd8/usr/local/lib/python2.7/site-packages/disco/worker/__init__.py", line 290, in start self.run(task, job, **jobargs) File "/home/DISCO/data/solice/01/Job@549:db30e:25bd8/usr/local/lib/python2.7/site-packages/disco/worker/classic/worker.py", line 286, in run getattr(self, task.mode)(task, params) File "/home/DISCO/data/solice/01/Job@549:db30e:25bd8/usr/local/lib/python2.7/site-packages/disco/worker/classic/worker.py", line 299, in map for key, val in self['map'](entry, params): File "count_normal_words.py", line 12, in map File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads return _default_decoder.decode(s) File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode raise ValueError("No JSON object could be decoded") ValueError: No JSON object could be decoded 2012/11/25 21:43:26 master WARN: Job killed Status: [map] 1 waiting, 0 running, 0 done, 1 failed Traceback (most recent call last): File "count_normal_words.py", line 28, in <module> for word, count in result_iterator(job.wait(show=True)): File "/usr/local/lib/python2.7/site-packages/disco/core.py", line 348, in wait timeout, poll_interval * 1000) File "/usr/local/lib/python2.7/site-packages/disco/core.py", line 309, in check_results raise JobError(Job(name=jobname, master=self), "Status %s" % status) disco.error.JobError: Job Job@549:db30e:25bd8 failed: Status dead The Error states: ValueError: No JSON object could be decoded. Again, this works fine using the text file as input but now DDFS. Any ideas, I'm open to suggestions?
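    One way to narrow this down (a sketch, not a confirmed fix): make the map function tolerant of records that json.loads cannot parse and count them under a sentinel key instead of letting the worker die. If every record ends up under the sentinel, whatever DDFS feeds the map step is not the line-oriented JSON the script expects (for example because of how ddfs chunk framed the data), which points away from the JSON itself. The sentinel key name below is made up for this sketch.

    from disco.core import Job, result_iterator

    def map(line, params):
        import json
        import unicodedata
        try:
            r = json.loads(line).get('text') or u''
        except ValueError:
            # record could not be decoded -- count it instead of dying
            yield '__undecodable__', 1
            return
        s = unicodedata.normalize('NFD', r).encode('ascii', 'ignore')
        for word in s.split():
            yield word, 1

    def reduce(iter, params):
        from disco.util import kvgroup
        for word, counts in kvgroup(sorted(iter)):
            yield word, sum(counts)

    if __name__ == '__main__':
        job = Job().run(input=["tag://data:test1"], map=map, reduce=reduce)
        for word, count in result_iterator(job.wait(show=True)):
            print word, count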

    Read the article

  • e2fsck extremely slow, although enough memory exists

    - by kaefert
    I've got this external USB-Disk: kaefert@blechmobil:~$ lsusb -s 2:3 Bus 002 Device 003: ID 0bc2:3320 Seagate RSS LLC As can be seen in this dmesg output, there is some problem that prevents that disk from beeing mounted: kaefert@blechmobil:~$ dmesg ... [ 113.084079] usb 2-1: new high-speed USB device number 3 using ehci_hcd [ 113.217783] usb 2-1: New USB device found, idVendor=0bc2, idProduct=3320 [ 113.217787] usb 2-1: New USB device strings: Mfr=2, Product=3, SerialNumber=1 [ 113.217790] usb 2-1: Product: Expansion Desk [ 113.217792] usb 2-1: Manufacturer: Seagate [ 113.217794] usb 2-1: SerialNumber: NA4J4N6K [ 113.435404] usbcore: registered new interface driver uas [ 113.455315] Initializing USB Mass Storage driver... [ 113.468051] scsi5 : usb-storage 2-1:1.0 [ 113.468180] usbcore: registered new interface driver usb-storage [ 113.468182] USB Mass Storage support registered. [ 114.473105] scsi 5:0:0:0: Direct-Access Seagate Expansion Desk 070B PQ: 0 ANSI: 6 [ 114.474342] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.475089] sd 5:0:0:0: [sdb] Write Protect is off [ 114.475092] sd 5:0:0:0: [sdb] Mode Sense: 43 00 00 00 [ 114.475959] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 114.477093] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.501649] sdb: sdb1 [ 114.502717] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.504354] sd 5:0:0:0: [sdb] Attached SCSI disk [ 116.804408] EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 3976 failed (47397!=61519) [ 116.804413] EXT4-fs (sdb1): group descriptors corrupted! ... So I went and fired up my favorite partition manager - gparted, and told it to verify and repair the partition sdb1. 
This made gparted call e2fsck (version 1.42.4 (12-Jun-2012)) e2fsck -f -y -v /dev/sdb1 Although gparted called e2fsck with the "-v" option, sadly it doesn't show me the output of my e2fsck process (bugreport https://bugzilla.gnome.org/show_bug.cgi?id=467925 ) I started this whole thing on Sunday (2012-11-04_2200) evening, so about 48 hours ago, this is what htop says about it now (2012-11-06-1900): PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command 3704 root 39 19 1560M 1166M 768 R 98.0 19.5 42h56:43 e2fsck -f -y -v /dev/sdb1 Now I found a few posts on the internet that discuss e2fsck running slow, for example: http://gparted-forum.surf4.info/viewtopic.php?id=13613 where they write that its a good idea to see if the disk is just that slow because maybe its damaged, and I think these outputs tell me that this is not the case in my case: kaefert@blechmobil:~$ sudo hdparm -tT /dev/sdb /dev/sdb: Timing cached reads: 3562 MB in 2.00 seconds = 1783.29 MB/sec Timing buffered disk reads: 82 MB in 3.01 seconds = 27.26 MB/sec kaefert@blechmobil:~$ sudo hdparm /dev/sdb /dev/sdb: multcount = 0 (off) readonly = 0 (off) readahead = 256 (on) geometry = 364801/255/63, sectors = 5860533160, start = 0 However, although I can read quickly from that disk, this disk speed doesn't seem to be used by e2fsck, considering tools like gkrellm or iotop or this: kaefert@blechmobil:~$ iostat -x Linux 3.2.0-2-amd64 (blechmobil) 2012-11-06 _x86_64_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 14,24 47,81 14,63 0,95 0,00 22,37 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0,59 8,29 2,42 5,14 43,17 160,17 53,75 0,30 39,80 8,72 54,42 3,95 2,99 sdb 137,54 5,48 9,23 0,20 587,07 22,73 129,35 0,07 7,70 7,51 16,18 2,17 2,04 Now I researched a little bit on how to find out what e2fsck is doing with all that processor time, and I found the tool strace, which gives me this: kaefert@blechmobil:~$ sudo strace -p3704 lseek(4, 41026998272, SEEK_SET) = 41026998272 write(4, "\212\354K[_\361\3nl\212\245\352\255jR\303\354\312Yv\334p\253r\217\265\3567\325\257\3766"..., 4096) = 4096 lseek(4, 48404766720, SEEK_SET) = 48404766720 read(4, "\7t\260\366\346\337\304\210\33\267j\35\377'\31f\372\252\ffU\317.y\211\360\36\240c\30`\34"..., 4096) = 4096 lseek(4, 41027002368, SEEK_SET) = 41027002368 write(4, "\232]7Ws\321\352\t\1@[+5\263\334\276{\343zZx\352\21\316`1\271[\202\350R`"..., 4096) = 4096 lseek(4, 48404770816, SEEK_SET) = 48404770816 read(4, "\17\362r\230\327\25\346//\210H\v\311\3237\323K\304\306\361a\223\311\324\272?\213\tq \370\24"..., 4096) = 4096 lseek(4, 41027006464, SEEK_SET) = 41027006464 write(4, "\367yy>x\216?=\324Z\305\351\376&\25\244\210\271\22\306}\276\237\370(\214\205G\262\360\257#"..., 4096) = 4096 lseek(4, 48404774912, SEEK_SET) = 48404774912 read(4, "\365\25\0\21|T\0\21}3t_\272\373\222k\r\177\303\1\201\261\221$\261B\232\3142\21U\316"..., 4096) = 4096 ^CProcess 3704 detached around 16 of these lines every second, so 4 read and 4 write operations every second, which I don't consider to be a lot.. And finally, my question: Will this process ever finish? If those numbers from fseek (48404774912) represent bytes, that would be something like 45 gigabytes, with this beeing a 3 terrabyte disk, which would give me 134 days to go, if the speed stays constant, and e2fsck scans the disk like this completly and only once. Do you have some advice for me? 
I have most of the data on that disk elsewhere, but I've put a lot of hours into sorting and merging it to this disk, so I would prefer to getting this disk up and running again, without formatting it anew. I don't think that the hardware is damaged since the disk is only a few months and since I can't see any I/O errors in the dmesg output. UPDATE: I just looked at the strace output again (2012-11-06_2300), now it looks like this: lseek(4, 1419860611072, SEEK_SET) = 1419860611072 read(4, "3#\f\2447\335\0\22A\355\374\276j\204'\207|\217V|\23\245[\7VP\251\242\276\207\317:"..., 4096) = 4096 lseek(4, 43018145792, SEEK_SET) = 43018145792 write(4, "]\206\231\342Y\204-2I\362\242\344\6R\205\361\324\177\265\317C\334V\324\260\334\275t=\10F."..., 4096) = 4096 lseek(4, 1419860615168, SEEK_SET) = 1419860615168 read(4, "\262\305\314Y\367\37x\326\245\226\226\320N\333$s\34\204\311\222\7\315\236\336\300TK\337\264\236\211n"..., 4096) = 4096 lseek(4, 43018149888, SEEK_SET) = 43018149888 write(4, "\271\224m\311\224\25!I\376\16;\377\0\223H\25Yd\201Y\342\r\203\271\24eG<\202{\373V"..., 4096) = 4096 lseek(4, 1419860619264, SEEK_SET) = 1419860619264 read(4, ";d\360\177\n\346\253\210\222|\250\352T\335M\33\260\320\261\7g\222P\344H?t\240\20\2548\310"..., 4096) = 4096 lseek(4, 43018153984, SEEK_SET) = 43018153984 write(4, "\360\252j\317\310\251G\227\335{\214`\341\267\31Y\202\360\v\374\307oq\3063\217Z\223\313\36D\211"..., 4096) = 4096 So the numbers in the lseek lines before the reads, like 1419860619264 are already a lot bigger, standing for 1.29 terabytes if those numbers are bytes, so it doesn't seem to be a linear progress on a big scale, maybe there are only some areas that need work, that have big gaps in between them. UPDATE2: Okey, big disappointment, the numbers are back to very small again (2012-11-07_0720) lseek(4, 52174548992, SEEK_SET) = 52174548992 read(4, "\374\312\22\\\325\215\213\23\0357U\222\246\370v^f(\312|f\212\362\343\375\373\342\4\204mU6"..., 4096) = 4096 lseek(4, 46603526144, SEEK_SET) = 46603526144 write(4, "\370\261\223\227\23?\4\4\217\264\320_Am\246CQ\313^\203U\253\274\204\277\2564n\227\177\267\343"..., 4096) = 4096 so either e2fsck goes over the data multiple times, or it just hops back and forth multiple times. Or my assumption that those numbers are bytes is wrong. UPDATE3: Since it's mentioned here http://forums.fedoraforum.org/showthread.php?t=282125&page=2 that you can testisk while e2fsck is running, i tried that, though not with a lot of success. When asking testdisk to display the data of my partition, this is what I get: TestDisk 6.13, Data Recovery Utility, November 2011 Christophe GRENIER <[email protected]> http://www.cgsecurity.org 1 P Linux 0 4 5 45600 40 8 732566272 Can't open filesystem. Filesystem seems damaged. And this is what strace currently gives me (2012-11-07_1030) lseek(4, 212460343296, SEEK_SET) = 212460343296 read(4, "\315Mb\265v\377Gn \24\f\205EHh\2349~\330\273\203\3375\206\10\r3=W\210\372\352"..., 4096) = 4096 lseek(4, 47347830784, SEEK_SET) = 47347830784 write(4, "]\204\223\300I\357\4\26\33+\243\312G\230\250\371*m2U\t_\215\265J \252\342Pm\360D"..., 4096) = 4096 (times are in CET)
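    For what it's worth, the back-of-the-envelope estimate in the question can be written out explicitly. This is only a sketch, and it leans on two assumptions the later strace samples already cast doubt on: that the lseek() offsets are byte positions on the device, and that e2fsck walks the disk roughly linearly and only once.

    disk_bytes = 3 * 10**12            # nominal 3 TB drive
    offset_bytes = 48404774912         # offset seen in the first strace sample
    elapsed_hours = 48.0               # e2fsck had been running about 48 hours

    fraction_done = offset_bytes / float(disk_bytes)
    hours_left = elapsed_hours * (1 - fraction_done) / fraction_done

    print "%.1f%% done, roughly %.0f days to go" % (100 * fraction_done,
                                                    hours_left / 24)
    # prints about: 1.6% done, roughly 122 days to go -- the same order of
    # magnitude as the ~134 days estimated in the question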

    Read the article

  • secure synchronization of large amount of data

    - by goncalopp
    I need to automatically mirror a large amount (terabytes) of files in two unix machines over a slow link (1 Mbps). This needs to be done frequently, but the data doesn't change too much (delta transmission doesn't saturate the link). The usual solution would be rsync, but there's an additional requirement: it's undesirable, from a security standpoint, that either the source or destination machines have (keyless) ssh keys to each other, or any kind of filesystem access. All communication between the two machines should thus be initialized (and mediated) through a third machine. I've asked a separate question about rsync in particular here. Are there other obvious solutions I'm missing?
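    A sketch of the relay-mediated mirroring described above, assuming the third machine is allowed to hold SSH keys for both endpoints and enough disk for a staging copy; the hostnames and paths are placeholders.

    import subprocess

    SOURCE = "backup@source-host:/data/"     # placeholder
    DEST = "backup@dest-host:/data/"         # placeholder
    STAGE = "/srv/mirror/data/"              # staging copy kept on the relay

    def mirror(src, dst):
        # -a preserve attributes, -z compress (useful on a 1 Mbps link),
        # --delete keep the copy exact; consider --bwlimit on a shared link.
        subprocess.check_call(["rsync", "-az", "--delete", src, dst])

    mirror(SOURCE, STAGE)   # relay pulls changes from the source
    mirror(STAGE, DEST)     # relay pushes them on to the destination

    The obvious cost is that the relay keeps a full copy of the terabytes being mirrored; if that is unacceptable, the relay can only broker the connection rather than stage the data, which is closer to the separate rsync question mentioned above.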

    Read the article

  • Read data from separate CSV file in OpenOffice Calc

    - by Thomi
    Hi, I have a very large CSV file (many thousands of rows) that I want to work with. Due to the size of the file, I don't want to import it into OpenOffice. Rather, I want to create a spreadsheet that contains formulas & graphs that read from this (or any other) CSV file I point it at. Ideally the spreadsheet will ask me which CSV file I want to use, allowing me to change the data source dynamically. Any ideas?
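    If Calc itself struggles with a file that large, one workaround (only a sketch, with made-up file and column names) is to pre-reduce the CSV outside the spreadsheet and let the sheet's formulas and graphs read the much smaller summary instead; re-running the script against a different input is then the "change the data source" step.

    import csv
    from collections import defaultdict

    totals = defaultdict(float)
    with open("big_input.csv", "rb") as f:        # placeholder file name
        for row in csv.DictReader(f):
            totals[row["category"]] += float(row["value"])

    with open("summary.csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerow(["category", "total"])
        for key in sorted(totals):
            writer.writerow([key, totals[key]])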

    Read the article

  • Collaborative data modelling software?

    - by at01
    I'm trying to find a tool where a lot of people can work on a data model collaboratively. Embarcadero has an ER application called ER/Studio which apparently comes with a repository system that acts like typical version control software. That sounds great, except ER/Studio is expensive and this is a non-profit and open-source organization where we encourage members to contribute even small changes. What's the best solution? Either downloadable software or a web service would work. We don't mind paying, but the cost can't go up with the number of participants...

    Read the article

  • Finding optimal ddrescue command line options where Accuracy > Speed

    - by gav
    I've read up a bit about this tool and obviously looked at the man pages. The trouble is that ddrescue takes so long that I need to get the command right the first time. I wasn't sure how to improve on the vanilla invocation: $ sudo ./ddrescue -v /dev/disk0s5 MyVolImage.dmg MyVolRescue.log $ sudo ./ddrescue -v MyVolImage.dmg /dev/disk1s3 MyVolRestore.log This copies from HFS+ to HFS+ drives. The source (broken) HDD is connected via USB 2.0, the destination HDD is inside the MacBook, and I would choose accuracy over speed. There seem to be a lot of options but I'm not sure how they impact the quality and speed of recovery. Thanks, Gav
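    Not specific to this disk, but a common way to favour accuracy without paying for it up front is to run ddrescue in passes against the same log file: a fast first pass that skips the slow handling of problem areas, then retry passes over whatever is still missing. A sketch of that sequencing, reusing the paths from the question; check the flags against the installed ddrescue version before relying on them.

    import subprocess

    SRC, IMG, LOG = "/dev/disk0s5", "MyVolImage.dmg", "MyVolRescue.log"

    # Pass 1: copy everything that reads cleanly, skipping trouble spots (-n).
    subprocess.call(["ddrescue", "-v", "-n", SRC, IMG, LOG])

    # Pass 2: return to the difficult areas with three retries (-r3); the
    # shared log file means nothing already recovered is read again.
    subprocess.call(["ddrescue", "-v", "-r3", SRC, IMG, LOG])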

    Read the article

  • Foremost custom file type not accepted by -t argument

    - by Channel72
    I'm trying to recover a deleted file on an ext3 file system using the foremost utility. The file I want to recover is a hpp C++ source code file. However, foremost does not automatically support the hpp file extension, so I have to add it to the config file. So, following the instructions on the man page, I add the following line to the config file: hpp n 50000 include include ASCII Then I run foremost as follows: $foremost -v -T -t hpp -i /dev/md0 -o /home/recover/ Instead of doing anything, it just displays the help message. If I change the hpp to htm or jpg, it works. So apparently foremost isn't accepting the custom file type I added into the config file. But I've looked over this dozens of times now, and I can't see what I'm doing wrong. I'm following the instructions exactly. Why doesn't foremost recognize the new file type I added to the config file?

    Read the article

  • Reconstructing the disk order in RAID 6 with 7 disks

    - by rkotulla
    A little background to this question first: I am running a RAID-6 within a QNAP TS869L external RAID/NAS system. I started with 5 disks of 3 TB each back in the day, and later added another 2 disks of 3 TB to the RAID. The QNAP internals handled the growing and re-syncing etc., and everything seemed to be perfectly fine. About 2 weeks ago, I had one of the disks (disk #5; disk #2 has gone bad in the meantime) fail, and somehow (I have no idea why) disks 1 and 2 also got kicked out of the array. I replaced disk #5, but the RAID didn't start working again. After some calls to QNAP technical support, they re-created the array (using mdadm --create --force --assume-clean ...), but the resulting array couldn't find a filesystem, and I was kindly referred to a data recovery company that I can't afford. After some digging through old log files, resetting the disk to factory default, etc., I found a few errors that were made during this re-create - I wish I still had some of the original metadata, but unfortunately I don't (I definitely learned that lesson). I'm currently at the point where I know the correct chunk size (64K) and metadata version (1.0; the factory default was 0.9, but from what I read 0.9 doesn't handle disks over 2 TB, and mine are 3 TB), and I can now find the ext4 filesystem that should be on the disks. The only variable left to determine is the right disk order! I started using the description found in answer #4 of "Recover RAID 5 data after created new array instead of re-using" but am a little confused about what the order should be for a proper RAID-6. RAID-5 is pretty well documented in a number of places, but RAID-6 much less so. Also, does the layout, i.e. the distribution of parity and data chunks across the disks, change after growing the array from 5 to 7 disks, or does the re-sync re-organize them the way a native 7-disk RAID-6 would have been laid out?
Thanks some more mdadm output that might be helpful: mdadm version: [~] # mdadm --version mdadm - v2.6.3 - 20th August 2007 mdadm details from one of the disks in the array: [~] # mdadm --examine /dev/sda3 /dev/sda3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa Name : 0 Creation Time : Tue Jun 10 10:27:58 2014 Raid Level : raid6 Raid Devices : 7 Used Dev Size : 5857395112 (2793.02 GiB 2998.99 GB) Array Size : 29286975360 (13965.12 GiB 14994.93 GB) Used Size : 5857395072 (2793.02 GiB 2998.99 GB) Super Offset : 5857395368 sectors State : clean Device UUID : 7c572d8f:20c12727:7e88c888:c2c357af Update Time : Tue Jun 10 13:01:06 2014 Checksum : d275c82d - correct Events : 7036 Chunk Size : 64K Array Slot : 0 (0, 1, failed, 3, failed, 5, 6) Array State : Uu_u_uu 2 failed mdadm details for the array in the current disk-order (based on my best guess reconstructed from old log-files) [~] # mdadm --detail /dev/md0 /dev/md0: Version : 01.00.03 Creation Time : Tue Jun 10 10:27:58 2014 Raid Level : raid6 Array Size : 14643487680 (13965.12 GiB 14994.93 GB) Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB) Raid Devices : 7 Total Devices : 5 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Tue Jun 10 13:01:06 2014 State : clean, degraded Active Devices : 5 Working Devices : 5 Failed Devices : 0 Spare Devices : 0 Chunk Size : 64K Name : 0 UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa Events : 7036 Number Major Minor RaidDevice State 0 8 3 0 active sync /dev/sda3 1 8 19 1 active sync /dev/sdb3 2 0 0 2 removed 3 8 51 3 active sync /dev/sdd3 4 0 0 4 removed 5 8 99 5 active sync /dev/sdg3 6 8 83 6 active sync /dev/sdf3 output from /proc/mdstat (md8, md9, and md13 are internally used RAIDs holding swap, etc; the one I'm after is md0) [~] # more /proc/mdstat Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] md0 : active raid6 sdf3[6] sdg3[5] sdd3[3] sdb3[1] sda3[0] 14643487680 blocks super 1.0 level 6, 64k chunk, algorithm 2 [7/5] [UU_U_UU] md8 : active raid1 sdg2[2](S) sdf2[3](S) sdd2[4](S) sdc2[5](S) sdb2[6](S) sda2[1] sde2[0] 530048 blocks [2/2] [UU] md13 : active raid1 sdg4[3] sdf4[4] sde4[5] sdd4[6] sdc4[2] sdb4[1] sda4[0] 458880 blocks [8/7] [UUUUUUU_] bitmap: 21/57 pages [84KB], 4KB chunk md9 : active raid1 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sda1[0] sdb1[1] 530048 blocks [8/7] [UUUUUUU_] bitmap: 37/65 pages [148KB], 4KB chunk unused devices: <none>
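    Since the chunk size, metadata version and filesystem type are already known, the remaining unknown really is the permutation of the disks. A sketch of brute-forcing it, along the lines of the linked RAID-5 answer: generate every candidate order (the two absent members given as "missing"), print the corresponding trial command, and test each candidate non-destructively (for example with fsck.ext4 -n on the assembled device) before keeping any of them. The command template mirrors the parameters quoted in the question; the script only prints commands, it does not run them.

    from itertools import permutations

    disks = ["/dev/sda3", "/dev/sdb3", "/dev/sdd3", "/dev/sdg3", "/dev/sdf3",
             "missing", "missing"]       # five surviving members + two gone

    candidates = sorted(set(permutations(disks)))
    print "%d candidate orders to try" % len(candidates)   # 2520

    for devices in candidates:
        print ("mdadm --create /dev/md0 --assume-clean --level=6 "
               "--raid-devices=7 --chunk=64 --metadata=1.0 " + " ".join(devices))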

    Read the article

  • Able to ping but does not get the data

    - by Dany
    I am facing a problem in my client-server program: when using wireless I can ping but do not receive any data. There is a source which receives a streaming request from the client via the server. This works fine when all the machines are connected through LAN cables, but when I put all the computers on a wi-fi network, all the machines are able to ping each other; yet when the client sends the stream request to the server, the ping between server and client says destination unreachable. Everything works fine as long as the client does not send the streaming request. What might be the issue?
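    One way to separate a network problem from a bug in the streaming code is to check whether plain TCP data crosses the wireless link at all, with a throwaway echo test like the sketch below (the port number is arbitrary). If the echo succeeds but the streaming request still breaks connectivity, suspicion shifts to packet size/MTU or access-point settings such as client isolation rather than to basic reachability.

    import socket
    import sys

    PORT = 5000   # arbitrary test port

    if sys.argv[1:] == ["server"]:
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", PORT))
        srv.listen(1)
        conn, addr = srv.accept()
        print "connection from", addr
        conn.sendall(conn.recv(1024))        # echo whatever arrives
        conn.close()
    else:
        host = sys.argv[1]                   # usage: python tcptest.py <server-ip>
        cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        cli.connect((host, PORT))
        cli.sendall("hello over wifi")
        print "echoed back:", cli.recv(1024)
        cli.close()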

    Read the article

  • "The server closed the connection without sending any data"

    - by Toby
    Server setup The problem Diagnostic information What I've tried Specific Help needed 1. I have the following server setup: Debian Squeeze Linux 2.6.32-5-amd64 Apache2-mpm-prefork 2.2.16-6+squeeze10 PHP 5.3.3-7+squeeze14 This server is protected with the Suhosin Patch 0.9.9.1 Max Requests Per Child: 0 - Keep Alive: on - Max Per Connection: 100 Timeouts Connection: 300 - Keep-Alive: 15 Loaded Modules core mod_log_config mod_logio prefork http_core mod_so mod_alias mod_auth_basic mod_auth_digest mod_authn_file mod_authz_default mod_authz_groupfile mod_authz_host mod_authz_user mod_cgi mod_deflate mod_dir mod_env mod_mime mod_negotiation mod_php5 mod_reqtimeout mod_rewrite mod_setenvif mod_ssl mod_status Wordpress 3.4.2 (Upgrading to 3.5 soon :) 2. The problem When I restart the server (sudo shutdown -r now), going to any website page results in the following error from the web browser (in this case, Google Chrome, but other browsers also show the same error). This error can also occur an hour or so after all is working ok, seemingly randomly, which is my biggest concern as it means my server is not reliable: No data received Unable to load the web page because the server sent no data. Here are some suggestions: Reload this web page later. Error 324 (net::ERR_EMPTY_RESPONSE): The server closed the connection without sending any data. 3. Diagnostic information The apache error log contains the folowing entries: [Fri Dec 14 22:23:27 2012] [notice] child pid 1955 exit signal Floating point exception (8) [Fri Dec 14 22:23:27 2012] [notice] child pid 1956 exit signal Floating point exception (8) [Fri Dec 14 22:23:29 2012] [notice] child pid 1957 exit signal Floating point exception (8) [Fri Dec 14 22:23:30 2012] [notice] child pid 1958 exit signal Floating point exception (8) [Fri Dec 14 22:23:32 2012] [notice] child pid 1959 exit signal Floating point exception (8) [Fri Dec 14 22:23:32 2012] [notice] child pid 1960 exit signal Floating point exception (8) [Fri Dec 14 22:23:34 2012] [notice] child pid 1961 exit signal Floating point exception (8) [Fri Dec 14 22:23:34 2012] [notice] child pid 1962 exit signal Floating point exception (8) 4. What I've tried a) I can 'fix' the website temporarily by resetting the server twice (resetting it once does not work) using the following commands. NB: the 'reload' option does not work, I have to use restart twice. However, the error can reoccur sometime later. sudo /etc/init.d/apache2 restart sudo /etc/init.d/apache2 restart b) I have tried disabling suhosin by uninstalling php5-suhosin, but a php info page still shows "This server is protected with the Suhosin Patch 0.9.9.1". I have tried putting Suhosin into simulation mode by creating a file /etc/php5/apache2/conf.d/suhosin.ini containing: [suhosin] suhosin.simulation = On The php info page shows the suhosin.ini file in the list of "Additional .ini files parsed" but the php info page still shows "This server is protected with the Suhosin Patch 0.9.9.1" c) Increasing the PHP memory limit In /etc/php5/apache2/ : ; Maximum amount of memory a script may consume (128MB) ; http://php.net/memory-limit memory_limit = 512M d) Disabling all Wordpress plugins, and going back to the default theme. 5. Specific help needed I would very much like help in debugging what is going on here. I am not sure how to determine what processes are in the Apache error log which are exiting "[notice] child pid 1955 exit signal Floating point exception (8)", or what is causing them to exit. 
And whether suhosin is part of the problem (and how to disable it if it is). Thank you in advance for any advice or tips you can offer in helping me debug this.
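    As a small debugging aid (a sketch, assuming the default Debian log location), the crash times and child PIDs can be pulled out of the error log so they can be lined up against access-log entries from the same seconds; a crash that always follows requests for one particular URL usually points at the PHP extension or script that request exercises.

    import re

    pattern = re.compile(r"\[(.+?)\] \[notice\] child pid (\d+) exit signal (.+)")

    with open("/var/log/apache2/error.log") as log:   # adjust path if needed
        for line in log:
            m = pattern.search(line)
            if m:
                when, pid, sig = m.groups()
                print "%s  pid=%s  %s" % (when, pid, sig)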

    Read the article

  • How to save a ntfs partition which suddenly became empty

    - by SteveO
    One NTFS partition of my laptop was suddenly wiped out, without any notice to me, when I rebooted from Windows 7 to Ubuntu 12.04 today. I need help saving my files on that partition, which are important and unfortunately haven't been backed up yet. My laptop has two operating systems, Windows 7 and Ubuntu 12.04, with an NTFS partition shared between the two for storing some data files (109 GB, about 97% of which has been used). I have almost always been using Ubuntu, but today I happened to have to work under Windows. Following is a record of what happened, in time order, numbered according to which operating system I was in at each stage. When I started into Windows 7, right before being able to log in, it took a while and two reboots to configure Windows. I thought it was normal, since the last time I used Windows, two weeks ago, it took very long and several reboots to update Windows, as the time before that had been in November last year. Then, after finally being able to log in to Windows 7, I installed LibreOffice, MathType (I got it from http://dl.portablesoft.org/down/?id=2515, which I originally thought was a trial version, but later learned was a cracked version and felt wrong about. I made a copy of it at dropbox http://dl.dropbox.com/u/13029929/MathType_6.8_PortableSoft.rar, not for distributing it but to list it there just in case it helps to identify the problem), and MiKTeX. I then edited some .doc files on the NTFS partition under both Microsoft Office with MathType, and LibreOffice. When I finished working under Windows and rebooted into Ubuntu, Ubuntu did some filesystem checking and reported that the NTFS partition could not be mounted. Then I rebooted again into Windows, and found that the NTFS partition had been emptied, i.e. all the data files were gone, and only one system file bootsqm.dat and one system directory System Volume Information were there, with their last-updated time being the time when I first rebooted from Windows to Ubuntu (in fact, it is 4 hours ahead of the actual time of that reboot, see immediately below). I also noticed that the time shown by Windows is not correct for my time zone ((UTC-05:00) Eastern Time (US & Canada)): it is 4 hours ahead of the correct time (my current time is 3am, but the computer shows 7am). The same thing happened when I rebooted into Ubuntu again: the NTFS partition has been emptied and left with only one Windows system file bootsqm.dat and one Windows system directory System Volume Information, and the time shown by Ubuntu is 4 hours ahead of the correct time. What can I do to retrieve my data files on the NTFS partition? If I am not able to do it myself, would some professionals be able to help me out? Thanks a lot! PS: I don't think I did anything that required emptying that partition, but there was quite a lot of work done during that stage right before the reboot from Windows to Ubuntu when the problem occurred. Did I do something wrong?

    Read the article
