efficient storage - Page 339

FILESTREAM/FILETABLE Clarifications for Implementation

- by user1209734

Recently our team was looking at FILESTREAM to expand the capabilities of our proprietary application. The main purpose of this app is managing the various PDFS, Images and documents to all of the parts we manufacture. Our ASP application uses a few third party tools to allow viewing of these files. We currently have 980GB of data on the Fileserver. We have around 200GB of Binary data in SQL Server that we would like to extract since it is not performing well hence FILESTREAM seems to be a good compromise to the two major data storage/access issues. A few things are not exactly clear to us: FILESTREAM Can or Cannot store its data on a drive that is not locally attached. We already have a File Server with a RAID 10 (1.5TB drives). This server stores all of the documents right now, would we have to move these drives to the SQL Server for FILESTREAM? That would be a tough bullet to bite since the server also is doubling as the Application Server (Two VMs on one physical server). FILETABLE stores the common metadata about the files but where is the Full Text part of it stored to allow searching of files like doc/docx? Is this separate? Are you able to freely add criteria to this to search by? If so any links to clarify would be appreciated. Can FILETABLE be referenced in another table with a foreign key? Thank you in advance EDIT: For those having these questions this web video covered everything and more in terms of explaining filestream from 2008 to 2012 and the cavets to consider (I would seriously rep him if I could): http://channel9.msdn.com/Events/TechDays/Techdays-2012-the-Netherlands/2270 In conclusion we will not be using FILESTREAM as it would be way to huge of an upsurge to accommodate for investment. EDIT 2: Update to #1 - After carefully assessing FileTable in addition to FILESTREAM we got a winning combination. We did have to move the files over to the new server (wasn't to painful since they were on the same VM).It honestly took more time to write an extraction tool to dump the binary data within SQL to the File System. Update to #2 - This was seperate but again Bob had an excellent webinar explaining this: http://channel9.msdn.com/Events/TechEd/Europe/2012/DBI411 Update to #3 - Using TFT inheritance we recycled the Docs table we had (minus the huge binary blobs) which required very little changes in our legacy apps. This was a huge upshot for the developer team.

Read the article

Databinding to ObservableCollection in a different UserControl?

- by Dave

Question re-written on 2010-03-24 I have two UserControls, where one is a dialog that has a TabControl, and the other is one that appears within said TabControl. I'll just call them CandyDialog and CandyNameViewer for simplicity's sake. There's also a data management class called Tracker that manages information storage, which for all intents and purposes just exposes a public property that is an ObservableCollection. I display the CandyNameViewer in CandyDialog via code behind, like this: private void CandyDialog_Loaded( object sender, RoutedEventArgs e) { _candyviewer = new CandyViewer(); _candyviewer.DataContext = _tracker; candy_tab.Content = _candyviewer; } The CandyViewer's XAML looks like this (edited for kaxaml): <Page xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"> <Page.Resources> <DataTemplate x:Key="CandyItemTemplate"> <Grid> <Grid.ColumnDefinitions> <ColumnDefinition Width="120"></ColumnDefinition> <ColumnDefinition Width="150"></ColumnDefinition> </Grid.ColumnDefinitions> <TextBox Grid.Column="0" Text="{Binding CandyName}" Margin="3"></TextBox>  <ComboBox Grid.Column="1" SelectedItem="{Binding SelectedCandy, Mode=TwoWay}" ItemsSource="{Binding RelativeSource={RelativeSource FindAncestor, AncestorType={x:Type UserControl}}, Path=DataContext.CandyNames}" Margin="3"></ComboBox> </Grid> </DataTemplate> </Page.Resources> <Grid> <ListBox DockPanel.Dock="Top" ItemsSource="{Binding CandyBoxContents, Mode=TwoWay}" ItemTemplate="{StaticResource CandyItemTemplate}" /> </Grid> </Page> Now everything works fine when the controls are loaded. As long as CandyNames is populated first, and then the consumer UserControl is displayed, all of the names are there. I obviously don't get any errors in the Output Window or anything like that. The issue I have is that when the ObservableCollection is modified from the model, those changes are not reflected in the consumer UserControl! I've never had this problem before; all of my previous uses of ObservableCollection updated fine, although in those cases I wasn't databinding across assemblies. Although I am currently only adding and removing candy names to/from the ObservableCollection, at a later date I will likely also allow renaming from the model side. Is there something I did wrong? Is there a good way to actually debug this? Reed Copsey indicates here that inter-UserControl databinding is possible. Unfortunately, my favorite Bea Stollnitz article on WPF databinding debugging doesn't suggest anything that I could use for this particular problem.

Read the article

PostgreSQL to Data-Warehouse: Best approach for near-real-time ETL / extraction of data

- by belvoir

Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi real-time basis (some-one is bound to ask what semi real-time means and the answer is as frequently as I reasonably can but I will be pragmatic, as a benchmark lets say we are hoping for every 15min) and feed it into a data-warehouse. How much data? At peak times we are talking approx 80-100k rows per min hitting the OLTP side, off-peak this will drop significantly to 15-20k. The most frequently updated rows are ~64 bytes each but there are various tables etc so the data is quite diverse and can range up to 4000 bytes per row. The OLTP is active 24x5.5. Best Solution? From what I can piece together the most practical solution is as follows: Create a TRIGGER to write all DML activity to a rotating CSV log file Perform whatever transformations are required Use the native DW data pump tool to efficiently pump the transformed CSV into the DW Why this approach? TRIGGERS allow selective tables to be targeted rather than being system wide + output is configurable (i.e. into a CSV) and are relatively easy to write and deploy. SLONY uses similar approach and overhead is acceptable CSV easy and fast to transform Easy to pump CSV into the DW Alternatives considered .... Using native logging (http://www.postgresql.org/docs/8.3/static/runtime-config-logging.html). Problem with this is it looked very verbose relative to what I needed and was a little trickier to parse and transform. However it could be faster as I presume there is less overhead compared to a TRIGGER. Certainly it would make the admin easier as it is system wide but again, I don't need some of the tables (some are used for persistent storage of JMS messages which I do not want to log) Querying the data directly via an ETL tool such as Talend and pumping it into the DW ... problem is the OLTP schema would need tweaked to support this and that has many negative side-effects Using a tweaked/hacked SLONY - SLONY does a good job of logging and migrating changes to a slave so the conceptual framework is there but the proposed solution just seems easier and cleaner Using the WAL Has anyone done this before? Want to share your thoughts?

Read the article

Issues with MIPS interrupt for tv remote simulator

- by pred2040

Hello I am writing a program for class to simulate a tv remote in a MIPS/SPIM enviroment. The functions of the program itself are unimportant as they worked fine before the interrupt so I left them all out. The gaol is basically to get a input from the keyboard by means of interupt, store it in $s7 and process it. The interrupt is causing my program to repeatedly spam the errors: Exception occurred at PC=0x00400068 Bad address in data/stack read: 0x00000004 Exception occurred at PC=0x00400358 Bad address in data/stack read: 0x00000000 program starts here .data msg_tvworking: .asciiz "tv is working\n" msg_sec: .asciiz "sec -- " msg_on: .asciiz "Power On" msg_off: .asciiz "Power Off" msg_channel: .asciiz " Channel " msg_volume: .asciiz " Volume " msg_sleep: .asciiz " Sleep Timer: " msg_dash: .asciiz "-\n" msg_newline: .asciiz "\n" msg_comma: .asciiz ", " array1: .space 400 # 400 bytes of storage for 100 channels array2: .space 400 # copy of above for sorting var1: .word 0 # 1 if 0-9 is pressed, 0 if not var2: .word 0 # stores number of channel (ex. 2-) var3: .word 0 # channel timer var4: .word 0 # 1 if s pressed once, 2 if twice, 0 if not var5: .word 0 # sleep wait timer var6: .word 0 # program timer var9: .float 0.01 # for channel timings .kdata var7: .word 10 var8: .word 11 .text .globl main main: li $s0, 300 li $s1, 0 # channel li $s2, 50 # volume li $s3, 1 # power - 1:on 0:off li $s4, 0 # sleep timer - 0:off li $s5, 0 # temporary li $s6, 0 # length of sleep period li $s7, 10000 # current key press li $t2, 0 # temp value not needed across calls li $t4, 0 interrupt data here mfc0 $a0, $12 ori $a0, 0xff11 mtc0 $a0, $12 lui $t0, 0xFFFF ori $a0, $0, 2 sw $a0, 0($t0) mainloop: # 1. get external input, and process it # input from interupt is taken from $a2 and placed in $s7 #for processing beq $a2, $0, next lw $s7, 4($a2) li $a2, 0 # call the process_input function here # jal process_input next: # 2. check sleep timer mainloopnext1: # 3. delay for 10ms jal delay_10ms jal check_timers jal channel_time # 4. print status lw $s5, var6 addi $s5, $s5, 1 sw $s5, var6 addi $s0, $s0, -1 bne $s0, $0, mainloopnext4 li $s0, 300 jal status_print mainloopnext4: j mainloop li $v0,10 # exit syscall -------------------------------------------------- status_print: seconds_stat: power_stat: on_stat: off_stat: channel_stat: volume_stat: sleep_stat: j $ra -------------------------------------------------- delay_10ms: li $t0, 6000 delay_10ms_loop: addi $t0, $t0, -1 bne $t0, $0, delay_10ms_loop jr $ra -------------------------------------------------- check_timers: channel_press: sleep_press: go_back_press: channel_check: channel_ignore: sleep_check: sleep_ignore: j $ra ------------------------------------------------ process_input: beq $s7, 112, power beq $s7, 117, channel_up beq $s7, 100, channel_down beq $s7, 108, volume_up beq $s7, 107, volume_down beq $s7, 115, sleep_init beq $s7, 118, history bgt $s7, 47, end_range jr $ra end_range: power: on: off: channel_up: over: channel_down: under: channel_message: channel_time: volume_up: volume_down: volume_message: sleep_init: sleep_incr: sleep: sleep_reset: history: digit_pad_init: digit_pad: jr $ra -------------------------------------------- interupt data here, followed closely from class .ktext 0x80000180 .set noat move $k1, $at .set at sw $v0, var7 sw $a0, var8 mfc0 $k0, $13 srl $a0, $k0, 2 andi $a0, $a0, 0x1f bne $a0, $zero, no_io lui $v0, 0xFFFF lw $a2, 4($v0) # keyboard data placed in $a2 no_io: mtc0 $0, $13 mfc0 $k0, $12 andi $k0, 0xfffd ori $k0, 0x11 mtc0 $k0, $12 lw $v0, var7 lw $a0, var8 .set noat move $at, $k1 .set at eret Thanks in advance.

Read the article

Delete Nodes + attributes that match Xpath except specific attributes

- by Ryan Ternier

I'm trying to find the best (efficient) way of doing this. I have a medium sized XML document. Depending on specific settings certain portions of it need to be filtered out for security reasons. I'll be doing this in XSLT as it's configurable and no code should need changing. I've looked around, but not getting much luck on it. For example: I have the following XPath: //*[@root='2.16.840.1.113883.3.51.1.1.6.1'] Whicrooth gives me all nodes with a root attribute equal to a specific OID. In these nodes I want to have all attributes except for a few (ex. foo and bar) erased, and then having another attribute added (ex. reason) I also need to have multiple XPath expressions that can be ran to zero down on a specific node and clear it's contents out in a similar fashion, with respect to nodes with specific attributes. I'm playing around with information from: XPath expression to select all XML child nodes except a specific list? and XSLT Remove Elements and/or Attributes by Name per XSL Parameters Will update shortly when I can have access what what I"ve done so far. Example: XML Before Transformation <root> <childNode> <innerChild root="2.16.840.1.113883.3.51.1.1.6.1" a="b" b="c" type="innerChildness"/> <innerChildSibling/> </childNode> <animals> <cat> <name>bob</name> </cat> </animals> <tree/> <water root="2.16.840.1.113883.3.51.1.1.6.1" z="zed" l="ell" type="liquidLIke"/> </root> After <root> <childNode> <innerChild root="2.16.840.1.113883.3.51.1.1.6.1" flavor="MSK"/>  <innerChildSibling/> </childNode> <animals> <cat flavor="MSK" />  </animals> <tree/> <water root="2.16.840.1.113883.3.51.1.1.6.1" flavor="MSK"/>  </root>

Read the article

delta-dictionary/dictionary with revision awareness in python?

- by shabbychef

I am looking to create a dictionary with 'roll-back' capabilities in python. The dictionary would start with a revision number of 0, and the revision would be bumped up only by explicit method call. I do not need to delete keys, only add and update key,value pairs, and then roll back. I will never need to 'roll forward', that is, when rolling the dictionary back, all the newer revisions can be discarded, and I can start re-reving up again. thus I want behaviour like: >>> rr = rev_dictionary() >>> rr.rev 0 >>> rr["a"] = 17 >>> rr[('b',23)] = 'foo' >>> rr["a"] 17 >>> rr.rev 0 >>> rr.roll_rev() >>> rr.rev 1 >>> rr["a"] 17 >>> rr["a"] = 0 >>> rr["a"] 0 >>> rr[('b',23)] 'foo' >>> rr.roll_to(0) >>> rr.rev 0 >>> rr["a"] 17 >>> rr.roll_to(1) Exception ... Just to be clear, the state associated with a revision is the state of the dictionary just prior to the roll_rev() method call. thus if I can alter the value associated with a key several times 'within' a revision, and only have the last one remembered. I would like a fairly memory-efficient implementation of this: the memory usage should be proportional to the deltas. Thus simply having a list of copies of the dictionary will not scale for my problem. One should assume the keys are in the tens of thousands, and the revisions are in the hundreds of thousands. We can assume the values are immutable, but need not be numeric. For the case where the values are e.g. integers, there is a fairly straightforward implementation (have a list of dictionaries of the numerical delta from revision to revision). I am not sure how to turn this into the general form. Maybe bootstrap the integer version and add on an array of values? all help appreciated.

Read the article

SQL Server 2008: FileStream Insertion Failure w/ .NET 3.5SP1

- by James Alexander

I've configured a db w/ a FileStream group and have a table w/ File type on it. When attempting to insert a streamed file and after I create the table row, my query to read the filepath out and the buffer returns a null file path. I can't seem to figure out why though. Here is the table creation script: /****** Object: Table [dbo].[JobInstanceFile] Script Date: 03/22/2010 18:05:36 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO CREATE TABLE [dbo].[JobInstanceFile]( [JobInstanceFileId] [int] IDENTITY(1,1) NOT NULL, [JobInstanceId] [int] NOT NULL, [File] [varbinary](max) FILESTREAM NULL, [FileId] [uniqueidentifier] ROWGUIDCOL NOT NULL, [Created] [datetime] NOT NULL, CONSTRAINT [PK_JobInstanceFile] PRIMARY KEY CLUSTERED ( [JobInstanceFileId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] FILESTREAM_ON [JobInstanceFilesGroup], UNIQUE NONCLUSTERED ( [FileId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] FILESTREAM_ON [JobInstanceFilesGroup] GO SET ANSI_PADDING OFF GO ALTER TABLE [dbo].[JobInstanceFile] ADD DEFAULT (newid()) FOR [FileId] GO Here's my proc I call to create the row before streaming the file: /****** Object: StoredProcedure [dbo].[JobInstanceFileCreate] Script Date: 03/22/2010 18:06:23 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO create proc [dbo].[JobInstanceFileCreate] @JobInstanceId int, @Created datetime as insert into JobInstanceFile (JobInstanceId, FileId, Created) values (@JobInstanceId, newid(), @Created) select scope_identity() GO And lastly, here's the code I'm using: public int CreateJobInstanceFile(int jobInstanceId, string filePath) { using (var connection = new SqlConnection(ConfigurationManager.ConnectionStrings["ConsumerMarketingStoreFiles"].ConnectionString)) using (var fileStream = new FileStream(filePath, FileMode.Open)) { connection.Open(); var tran = connection.BeginTransaction(IsolationLevel.ReadCommitted); try { //create the JobInstanceFile instance var command = new SqlCommand("JobInstanceFileCreate", connection) { Transaction = tran }; command.CommandType = CommandType.StoredProcedure; command.Parameters.AddWithValue("@JobInstanceId", jobInstanceId); command.Parameters.AddWithValue("@Created", DateTime.Now); int jobInstanceFileId = Convert.ToInt32(command.ExecuteScalar()); //read out the filestream transaction context to stream the file for storage command.CommandText = "select [File].PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() from JobInstanceFile where JobInstanceFileId = @JobInstanceFileId"; command.CommandType = CommandType.Text; command.Parameters.AddWithValue("@JobInstanceFileId", jobInstanceFileId); using (SqlDataReader dr = command.ExecuteReader()) { dr.Read(); //get the file path we're writing out to string writePath = dr.GetString(0); using (var writeStream = new SqlFileStream(writePath, (byte[])dr.GetValue(1), FileAccess.ReadWrite)) { //copy from one stream to another byte[] bytes = new byte[65536]; int numBytes; while ((numBytes = fileStream.Read(bytes, 0, 65536)) 0) writeStream.Write(bytes, 0, numBytes); } } tran.Commit(); return jobInstanceFileId; } catch (Exception e) { tran.Rollback(); throw e; } } } Can someone please let me know what I'm doing wrong. In the code, the following expression is returning null for the file path and shouldn't be: //get the file path we're writing out to string writePath = dr.GetString(0); The server is different then the computer the code is running on but the necessary shares appear to be in order and I have also run the following: EXEC sp_configure filestream_access_level, 2 Any help would be greatly appreciated. Thanks!

Read the article

Bazaar newbie question about repository structures

- by esc1729

I want to use Bazaar on Windows XP for web-development and related tasks. Most of the files are edited locally and then transferred via FTP to the server. Just now the repository sits on my local workstation. Later on it should be shared locally with some co-workers. Perhaps we will use a local Linux server as a centralized repository, but this structure is not decided for now. But first I need to understand the impacts of the different repository setups, which I do not at all. Using Bazaar-Explorer on Windows XP I’ve created a ‘shared tree repository’ from the option list of the init-dialogue in some location dev-filter/. Bazaar Explorer tells me: Created repository with treeless branches at F:/bzr.local/dev-filter Created branch at F:/bzr.local/dev-filter/trunk Created working tree at F:/bzr.local/dev-filter/work OK so far. Now I move a bunch of files into the work directory and add and commit them as Rev 1 ‘Start Revision’. Then I work on some of these files and commit them again as Rev 2. Here my confusion starts. Shouldn’t both revisions go into the trunk? The trunk is still empty, beside the .bzr directory which only holds some management information. If I delete my working directory, which I have tried during these first experiments, everything is gone. There’s obviously no hidden storage of those files. OK. Perhaps I need to push it into the trunk? This does not work either. Entering the work/ directory and initializing the ‘push’ to the trunk, Bazaar-Explorer tells me No new revisions to push. So what? This looks like a severe conceptual misunderstanding about what should happen on my side. Edit, 2010-02-03: Some conclusions What I learned meanwhile is this: I think I should switch to the command line until I really understand what’s going on, at least for creating the repositories and branches. Bazaar Explorer introduces a new level of abstraction which I only can handle if I understand the level beneath One of the secrets of working with Bazaar at least for me is to understand those .bzr directories, their particular properties and states when created with ‘bzr init’, ‘bzr init-repository’, ‘bzr branch’ etc. in all their variants and how they are plumped together. While there’s a whole chapter of ‘Organizing your workspace’ in the Bazaar User Guide, it’s more or less workflow oriented. The manual contains a lot of directory structures for the given examples. What I would prefer beside this and have not (or only rudimentary) found so far is some graphical representation of those ‘Lego like’ .bzr building blocks which create the linking of all the parts. So I started to invent some simple notation while working through the examples and looking into the .bzr directories to document what information is stored there, where does it come from, how and to what is it linked, is it complete or shared, etc. Erich Schreiber

Read the article

Append indexed input fields with attached datepicker widget

- by Micor

I am looking for the most efficient way to add a set of fields with datepicker widget attached on click event. So here I have a set of indexed events being generated on a page: <div id="events"> <div class="event"> <input type="text" id="event_0_startDate" name="event[0].startDate" class="date"/> <input type="hidden" id="event_0_startDate_day" name="event[0].startDate_day" /> <input type="hidden" id="event_0_startDate_month" name="event[0].startDate_month" /> <input type="hidden" id="event_0_startDate_year" name="event[0].startDate_year"/> </div> <div class="event"> <input type="text" id="event_1_startDate" name="event[1].startDate" class="date"/> <input type="hidden" id="event_1_startDate_day" name="event[1].startDate_day" /> <input type="hidden" id="event_1_startDate_month" name="event[1].startDate_month" /> <input type="hidden" id="event_1_startDate_year" name="event[1].startDate_year" /> </div> .... more event divs .... </div> <a href="#" id="add_event">Add event</a> .. with datepicker widget attached: $(".date").datepicker({ onClose: function(dateText,picker) { var prefix = $(this).attr('id'); $('#' + prefix + '_month').val( dateText.split(/\//)[0] ); $('#' + prefix + '_day').val( dateText.split(/\//)[1] ); $('#' + prefix + '_year').val( dateText.split(/\//)[2] ); } }); I need a function that will add inside div id="events" after the last div class="event" another div class="event" where N is the next index and attach datepicker widget above to the new input field with class="date" like: <div class="event"> <input type="text" id="event_N_startDate" name="event[N].startDate" class="date"/> <input type="hidden" id="event_N_startDate_day" name="event[N].startDate_day"/> <input type="hidden" id="event_N_startDate_month" name="event[N].startDate_month" /> <input type="hidden" id="event_N_startDate_year" name="event[N].startDate_year" /> </div> Thank you.

Read the article

Need help implementing simple socket server using GIOService (GLib, Glib-GIO)

- by Mark Renouf

I'm learning the basics of writing a simple, efficient socket server using GLib. I'm experimenting with GSocketService. So far I can only seem to accept connections but then they are immediately closed. From the docs I can't figure out what step I am missing. I'm hoping someone can shed some light on this for me. When running the following: # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. # telnet localhost 4000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. Connection closed by foreign host. Output from the server: # ./server New Connection from 127.0.0.1:36962 New Connection from 127.0.0.1:36963 New Connection from 127.0.0.1:36965 Current code: /* * server.c * * Created on: Mar 10, 2010 * Author: mark */ #include <glib.h> #include <gio/gio.h> gchar *buffer; gboolean network_read(GIOChannel *source, GIOCondition cond, gpointer data) { GString *s = g_string_new(NULL); GError *error; GIOStatus ret = g_io_channel_read_line_string(source, s, NULL, &error); if (ret == G_IO_STATUS_ERROR) g_error ("Error reading: %s\n", error->message); else g_print("Got: %s\n", s->str); } gboolean new_connection(GSocketService *service, GSocketConnection *connection, GObject *source_object, gpointer user_data) { GSocketAddress *sockaddr = g_socket_connection_get_remote_address(connection, NULL); GInetAddress *addr = g_inet_socket_address_get_address(G_INET_SOCKET_ADDRESS(sockaddr)); guint16 port = g_inet_socket_address_get_port(G_INET_SOCKET_ADDRESS(sockaddr)); g_print("New Connection from %s:%d\n", g_inet_address_to_string(addr), port); GSocket *socket = g_socket_connection_get_socket(connection); gint fd = g_socket_get_fd(socket); GIOChannel *channel = g_io_channel_unix_new(fd); g_io_add_watch(channel, G_IO_IN, (GIOFunc) network_read, NULL); return TRUE; } int main(int argc, char **argv) { g_type_init(); GSocketService *service = g_socket_service_new(); GInetAddress *address = g_inet_address_new_from_string("127.0.0.1"); GSocketAddress *socket_address = g_inet_socket_address_new(address, 4000); g_socket_listener_add_address(G_SOCKET_LISTENER(service), socket_address, G_SOCKET_TYPE_STREAM, G_SOCKET_PROTOCOL_TCP, NULL, NULL, NULL); g_object_unref(socket_address); g_object_unref(address); g_socket_service_start(service); g_signal_connect(service, "incoming", G_CALLBACK(new_connection), NULL); GMainLoop *loop = g_main_loop_new(NULL, FALSE); g_main_loop_run(loop); }

Read the article

Python: How to read huge text file into memory

- by asmaier

I'm using Python 2.6 on a Mac Mini with 1GB RAM. I want to read in a huge text file $ ls -l links.csv; file links.csv; tail links.csv -rw-r--r-- 1 user user 469904280 30 Nov 22:42 links.csv links.csv: ASCII text, with CRLF line terminators 4757187,59883 4757187,99822 4757187,66546 4757187,638452 4757187,4627959 4757187,312826 4757187,6143 4757187,6141 4757187,3081726 4757187,58197 So each line in the file consists of a tuple of two comma separated integer values. I want to read in the whole file and sort it according to the second column. I know, that I could do the sorting without reading the whole file into memory. But I thought for a file of 500MB I should still be able to do it in memory since I have 1GB available. However when I try to read in the file, Python seems to allocate a lot more memory than is needed by the file on disk. So even with 1GB of RAM I'm not able to read in the 500MB file into memory. My Python code for reading the file and printing some information about the memory consumption is: #!/usr/bin/python # -*- coding: utf-8 -*- import sys infile=open("links.csv", "r") edges=[] count=0 #count the total number of lines in the file for line in infile: count=count+1 total=count print "Total number of lines: ",total infile.seek(0) count=0 for line in infile: edge=tuple(map(int,line.strip().split(","))) edges.append(edge) count=count+1 # for every million lines print memory consumption if count%1000000==0: print "Position: ", edge print "Read ",float(count)/float(total)*100,"%." mem=sys.getsizeof(edges) for edge in edges: mem=mem+sys.getsizeof(edge) for node in edge: mem=mem+sys.getsizeof(node) print "Memory (Bytes): ", mem The output I got was: Total number of lines: 30609720 Position: (9745, 2994) Read 3.26693612356 %. Memory (Bytes): 64348736 Position: (38857, 103574) Read 6.53387224712 %. Memory (Bytes): 128816320 Position: (83609, 63498) Read 9.80080837067 %. Memory (Bytes): 192553000 Position: (139692, 1078610) Read 13.0677444942 %. Memory (Bytes): 257873392 Position: (205067, 153705) Read 16.3346806178 %. Memory (Bytes): 320107588 Position: (283371, 253064) Read 19.6016167413 %. Memory (Bytes): 385448716 Position: (354601, 377328) Read 22.8685528649 %. Memory (Bytes): 448629828 Position: (441109, 3024112) Read 26.1354889885 %. Memory (Bytes): 512208580 Already after reading only 25% of the 500MB file, Python consumes 500MB. So it seem that storing the content of the file as a list of tuples of ints is not very memory efficient. Is there a better way to do it, so that I can read in my 500MB file into my 1GB of memory?

Read the article

How to crop or get a smaller size UIImage in iPhone without memory leaks?

- by rkbang

Hello all, I am using a navigation controller in which I push a tableview Controller as follows: TableView *Controller = [[TableView alloc] initWithStyle:UITableViewStylePlain]; [self.navigationController pushViewController:Controller animated:NO]; [Controller release]; In this table view I am using following two methods to display images: - (UIImage*) getSmallImage:(UIImage*) img { CGSize size = img.size; CGFloat ratio = 0; if (size.width < size.height) { ratio = 36 / size.width; } else { ratio = 36 / size.height; } CGRect rect = CGRectMake(0.0, 0.0, ratio * size.width, ratio * size.height); UIGraphicsBeginImageContext(rect.size); [img drawInRect:rect]; return UIGraphicsGetImageFromCurrentImageContext(); UIGraphicsEndImageContext(); } - (UIImage*)imageByCropping:(UIImage *)imageToCrop toRect:(CGRect)rect { //create a context to do our clipping in UIGraphicsBeginImageContext(rect.size); CGContextRef currentContext = UIGraphicsGetCurrentContext(); //create a rect with the size we want to crop the image to //the X and Y here are zero so we start at the beginning of our //newly created context CGFloat X = (imageToCrop.size.width - rect.size.width)/2; CGFloat Y = (imageToCrop.size.height - rect.size.height)/2; CGRect clippedRect = CGRectMake(X, Y, rect.size.width, rect.size.height); //CGContextClipToRect( currentContext, clippedRect); //create a rect equivalent to the full size of the image //offset the rect by the X and Y we want to start the crop //from in order to cut off anything before them CGRect drawRect = CGRectMake(0, 0, imageToCrop.size.width, imageToCrop.size.height); CGContextTranslateCTM(currentContext, 0.0, drawRect.size.height); CGContextScaleCTM(currentContext, 1.0, -1.0); //draw the image to our clipped context using our offset rect //CGContextDrawImage(currentContext, drawRect, imageToCrop.CGImage); CGImageRef tmp = CGImageCreateWithImageInRect(imageToCrop.CGImage, clippedRect); //pull the image from our cropped context UIImage *cropped = [UIImage imageWithCGImage:tmp];//UIGraphicsGetImageFromCurrentImageContext(); CGImageRelease(tmp); //pop the context to get back to the default UIGraphicsEndImageContext(); //Note: this is autoreleased*/ return cropped; } But when I pop the Controller, the memory being used is not released. Is there any leaks in the above code used to create and crop images. Also are there any efficient method to deal with images in iPhone. I am having a lot of images and facing major challeges in resolving the memory issues. tnx in advance.

Read the article

Best way to handle multiple tables to replace one big table in Rails? (e.g. 'Books1', 'Books2', etc.

- by mikep

Hello, I've decided to use multiple tables for an entity (e.g. Books1, Books2, Books3, etc.), instead of just one main table which could end up having a lot of rows (e.g. just Books). I'm doing this to try and to avoid a potential future performance drop that could come with having too many rows in one table. With that, I'm looking for a good way to handle this in Rails, mainly by trying to avoid loading a bunch of unused associations. (I know that I could use a partition for this, but, for now, I've decided to go the 'multiple tables' route.) Each user has their books placed into a specific table. The actual book table is chosen when the user is created, and all of their books go into the same table. I'm going to split the adds across the tables. The goal is to try and keep each table pretty much even -- but that's a different issue. One thing I don't particularly want to have is a bunch of unused associations in the User class. Right now, it looks like I'd have to do the following: class User < ActiveRecord::Base has_many :books1, :books2, :books3, :books4, :books5 end class Books1 < ActiveRecord::Base belongs_to :user end class Books2 < ActiveRecord::Base belongs_to :user end class Books3 < ActiveRecord::Base belongs_to :user end I'm assuming that the main performance hit would come in terms of memory and possibly some method call overhead for each User object, since it has to load all of those associations, which in turn creates all of those nice, dynamic model accessor methods like User.find_by_. But for each specific user, only one of the book tables would be usable/applicable, since all of a user's books are stored in the same table. So, only one of the associations would be in use at any time and any other has_many :bookX association that was loaded would be a waste. For example, with a user.id of 2, I'd only need books3.find_by_author('Author'), but the way I'm thinking of setting this up, I'd still have access to Books1..n. I don't really know Ruby/Rails does internally with all of those has_many associations though, so maybe it's not so bad. But right now I'm thinking that it's really wasteful, and that there may just be a better, more efficient way of doing this. So, a few questions: 1) Is there's some sort of special Ruby/Rails methodology that could be applied to this 'multiple tables to represent one entity' scheme? Are there any 'best practices' for this? 2) Is it really bad to have so many unused has_many associations for each object? Is there a better way to do this? 3) Does anyone have any advice on how to abstract the fact that there's multiple book tables behind a single books model/class? For example, so I can call books.find_by_author('Author') instead of books3.find_by_author('Author'). Thank you!

Read the article

Overriding LINQ extension methods

- by Ruben Vermeersch

Is there a way to override extension methods (provide a better implementation), without explicitly having to cast to them? I'm implementing a data type that is able to handle certain operations more efficiently than the default extension methods, but I'd like to keep the generality of IEnumerable. That way any IEnumerable can be passed, but when my class is passed in, it should be more efficient. As a toy example, consider the following: // Compile: dmcs -out:test.exe test.cs using System; namespace Test { public interface IBoat { void Float (); } public class NiceBoat : IBoat { public void Float () { Console.WriteLine ("NiceBoat floating!"); } } public class NicerBoat : IBoat { public void Float () { Console.WriteLine ("NicerBoat floating!"); } public void BlowHorn () { Console.WriteLine ("NicerBoat: TOOOOOT!"); } } public static class BoatExtensions { public static void BlowHorn (this IBoat boat) { Console.WriteLine ("Patched on horn for {0}: TWEET", boat.GetType().Name); } } public class TestApp { static void Main (string [] args) { IBoat niceboat = new NiceBoat (); IBoat nicerboat = new NicerBoat (); Console.WriteLine ("## Both should float:"); niceboat.Float (); nicerboat.Float (); // Output: // NiceBoat floating! // NicerBoat floating! Console.WriteLine (); Console.WriteLine ("## One has an awesome horn:"); niceboat.BlowHorn (); nicerboat.BlowHorn (); // Output: // Patched on horn for NiceBoat: TWEET // Patched on horn for NicerBoat: TWEET Console.WriteLine (); Console.WriteLine ("## That didn't work, but it does when we cast:"); (niceboat as NiceBoat).BlowHorn (); (nicerboat as NicerBoat).BlowHorn (); // Output: // Patched on horn for NiceBoat: TWEET // NicerBoat: TOOOOOT! Console.WriteLine (); Console.WriteLine ("## Problem is: I don't always know the type of the objects."); Console.WriteLine ("## How can I make it use the class objects when the are"); Console.WriteLine ("## implemented and extension methods when they are not,"); Console.WriteLine ("## without having to explicitely cast?"); } } } Is there a way to get the behavior from the second case, without explict casting? Can this problem be avoided?

Read the article

print a linear linked list into a table

- by user1796970

I am attempting to print some values i have stored into a LLL into a readable table. The data i have stored is the following : DEBBIE STARR F 3 W 1000.00 JOAN JACOBUS F 9 W 925.00 DAVID RENN M 3 H 4.75 ALBERT CAHANA M 3 H 18.75 DOUGLAS SHEER M 5 W 250.00 SHARI BUCHMAN F 9 W 325.00 SARA JONES F 1 H 7.50 RICKY MOFSEN M 6 H 12.50 JEAN BRENNAN F 6 H 5.40 JAMIE MICHAELS F 8 W 150.00 i have stored each firstname, lastname, gender, tenure, payrate, and salary into their own List. And would like to be able to print them out in the same format that they are viewed on the text file i read them in from. i have messed around with a few methods that allow me to traverse and print the Lists, but i end up with ugly output. . . here is my code for the storage of the text file and the format i would like to print out: public class Payroll { private LineWriter lw; private ObjectList output; ListNode input; private ObjectList firstname, lastname, gender, tenure, rate, salary; public Payroll(LineWriter lw) { this.lw = lw; this.firstname = new ObjectList(); this.lastname = new ObjectList(); this.gender = new ObjectList(); this.tenure = new ObjectList(); this.rate = new ObjectList(); this.salary = new ObjectList(); this.output = new ObjectList(); this.input = new ListNode(); } public void readfile() { File file = new File("payfile.txt"); try{ Scanner scanner = new Scanner(file); while(scanner.hasNextLine()) { String line = scanner.nextLine(); Scanner lineScanner = new Scanner(line); lineScanner.useDelimiter("\\s+"); while(lineScanner.hasNext()) { firstname.insert1(lineScanner.next()); lastname.insert1(lineScanner.next()); gender.insert1(lineScanner.next()); tenure.insert1(lineScanner.next()); rate.insert1(lineScanner.next()); salary.insert1(lineScanner.next()); } } }catch(FileNotFoundException e) {e.printStackTrace();} } public void printer(LineWriter lw) { String msg = " FirstName " + " LastName " + " Gender " + " Tenure " + " Pay Rate " + " Salary "; output.insert1(msg); System.out.println(output.getFirst()); System.out.println(" " + firstname.getFirst() + " " + lastname.getFirst() + "\t" + gender.getFirst() + "\t" + tenure.getFirst() + "\t" + rate.getFirst() + "\t" + salary.getFirst()); } }

Read the article

Fulltext search for django : Mysql not so bad ? (vs sphinx, xapian)

- by Eric

I am studying fulltext search engines for django. It must be simple to install, fast indexing, fast index update, not blocking while indexing, fast search. After reading many web pages, I put in short list : Mysql MYISAM fulltext, djapian/python-xapian, and django-sphinx I did not choose lucene because it seems complex, nor haystack as it has less features than djapian/django-sphinx (like fields weighting). Then I made some benchmarks, to do so, I collected many free books on the net to generate a database table with 1 485 000 records (id,title,body), each record is about 600 bytes long. From the database, I also generated a list of 100 000 existing words and shuffled them to create a search list. For the tests, I made 2 runs on my laptop (4Go RAM, Dual core 2.0Ghz): the first one, just after a server reboot to clear all caches, the second is done juste after in order to test how good are cached results. Here are the "home made" benchmark results : 1485000 records with Title (150 bytes) and body (450 bytes) Mysql 5.0.75/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 7m14.146s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 Mysql 5.5.4 m3/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 6m08.154s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 1 thread, 100000 searchs with single word randomly taken from database : First run : 9m09s next run : 5m38s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:15.007353 1 thread, boolean search : 1000 x (+word1 +word2) First run : 0:00:21.205404 next run : 0:00:00.145098 Djapian Fulltext : ========================================================================== Full indexing : 84m7.601s 1 thread, 1000 searchs with single word randomly taken from database with prefetch : First run : 0:02:28.085680 next run : 0:00:14.300236 python-xapian Fulltext : ========================================================================== 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:26.402084 next run : 0:00:00.695092 django-sphinx Fulltext : ========================================================================== Full indexing : 1m25.957s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:30.073001 next run : 0:00:05.203294 1 thread, 100000 searchs with single word randomly taken from database : First run : 12m48s next run : 9m45s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:23.535319 1 thread, boolean search : 1000 x (word1 word2) First run : 0:00:20.856486 next run : 0:00:03.005416 As you can see, Mysql is not so bad at all for fulltext search. In addition, its query cache is very efficient. Mysql seems to me a good choice as there is nothing to install (I need just to write a small script to synchronize an Innodb production table to a MyISAM search table) and as I do not really need advanced search feature like stemming etc... Here is the question : What do you think about Mysql fulltext search engine vs sphinx and xapian ?

Read the article

NSXMLParser Memory Allocation Efficiency for the iPhone

- by Staros

Hello, I've recently been playing with code for an iPhone app to parse XML. Sticking to Cocoa, I decided to go with the NSXMLParser class. The app will be responsible for parsing 10,000+ "computers", all which contain 6 other strings of information. For my test, I've verified that the XML is around 900k-1MB in size. My data model is to keep each computer in an NSDictionary hashed by a unique identifier. Each computer is also represented by a NSDictionary with the information. So at the end of the day, I end up with a NSDictionary containing 10k other NSDictionaries. The problem I'm running into isn't about leaking memory or efficient data structure storage. When my parser is done, the total amount of allocated objects only does go up by about 1MB. The problem is that while the NSXMLParser is running, my object allocation is jumping up as much as 13MB. I could understand 2 (one for the object I'm creating and one for the raw NSData) plus a little room to work, but 13 seems a bit high. I can't imaging that NSXMLParser is that inefficient. Thoughts? Code... The code to start parsing... NSXMLParser *parser = [[NSXMLParser alloc] initWithData: data]; [parser setDelegate:dictParser]; [parser parse]; output = [[dictParser returnDictionary] retain]; [parser release]; [dictParser release]; And the parser's delegate code... -(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict { if(mutableString) { [mutableString release]; mutableString = nil; } mutableString = [[NSMutableString alloc] init]; } -(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string { if(self.mutableString) { [self.mutableString appendString:string]; } } -(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName { if([elementName isEqualToString:@"size"]){ //The initial key, tells me how many computers returnDictionary = [[NSMutableDictionary alloc] initWithCapacity:[mutableString intValue]]; } if([elementName isEqualToString:hashBy]){ //The unique identifier if(mutableDictionary){ [mutableDictionary release]; mutableDictionary = nil; } mutableDictionary = [[NSMutableDictionary alloc] initWithCapacity:6]; [returnDictionary setObject:[NSDictionary dictionaryWithDictionary:mutableDictionary] forKey:[NSMutableString stringWithString:mutableString]]; } if([fields containsObject:elementName]){ //Any of the elements from a single computer that I am looking for [mutableDictionary setObject:mutableString forKey:elementName]; } } Everything initialized and released correctly. Again, I'm not getting errors or leaking. Just inefficient. Thanks for any thoughts!

Read the article

How to optimize Core Data query for full text search

- by dk

Can I optimize a Core Data query when searching for matching words in a text? (This question also pertains to the wisdom of custom SQL versus Core Data on an iPhone.) I'm working on a new (iPhone) app that is a handheld reference tool for a scientific database. The main interface is a standard searchable table view and I want as-you-type response as the user types new words. Words matches must be prefixes of words in the text. The text is composed of 100,000s of words. In my prototype I coded SQL directly. I created a separate "words" table containing every word in the text fields of the main entity. I indexed words and performed searches along the lines of SELECT id, * FROM textTable JOIN (SELECT DISTINCT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) ON id=textTableId LIMIT 50 This runs very fast. Using an IN would probably work just as well, i.e. SELECT * FROM textTable WHERE id IN (SELECT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) LIMIT 50 The LIMIT is crucial and allows me to display results quickly. I notify the user that there are too many to display if the limit is reached. This is kludgy. I've spent the last several days pondering the advantages of moving to Core Data, but I worry about the lack of control in the schema, indexing, and querying for an important query. Theoretically an NSPredicate of textField MATCHES '.*\bfoo.*' would just work, but I'm sure it will be slow. This sort of text search seems so common that I wonder what is the usual attack? Would you create a words entity as I did above and use a predicate of "word BEGINSWITH 'foo'"? Will that work as fast as my prototype? Will Core Data automatically create the right indexes? I can't find any explicit means of advising the persistent store about indexes. I see some nice advantages of Core Data in my iPhone app. The faulting and other memory considerations allow for efficient database retrievals for tableview queries without setting arbitrary limits. The object graph management allows me to easily traverse entities without writing lots of SQL. Migration features will be nice in the future. On the other hand, in a limited resource environment (iPhone) I worry that an automatically generated database will be bloated with metadata, unnecessary inverse relationships, inefficient attribute datatypes, etc. Should I dive in or proceed with caution?

Read the article

"Wrapping" a BindingList<T> propertry with a List<T> property for serialization.

- by Eric

I'm writing an app that allows users search and browse catalogs of widgets. My WidgetCatalog class is serialized and deserialized to and from XML files using DataContractSerializer. My app is working now but I think I can make the code a lot more efficient if I started taking advantage of data binding rather then doing everything manually. Here's a stripped down version of my current WidgetCatalog class. [DataContract(Name = "WidgetCatalog")] class WidgetCatalog { [DataContract(Name = "Name")] public string Name { get; set; } [DataContract(Name = "Widgets")] public List<Widget> Widgets { get; set; } } I had to write a lot of extra code to keep my UI in sync when widgets are added or removed from a catalog, or their internal properties change. I'm pretty inexperienced with data-binding, but I think I want a BindingList<Widget> rather than a plain old List<Widget>. Is this right? In the past when I had similar needs I found that BindingList<T> does not serialize very well. Or at least the Event Handers between the items and the list are not serialized. I was using XmlSerializer though, and DataContractSerializer may work better. So I'm thinking of doing something like the code below. [DataContract(Name = "WidgetCatalog")] class WidgetCatalog { [DataMember(Name = "Name")] public string Name { get; set; } [DataMember(Name = "Widgets")] private List<Widget> WidgetSerializationList { get { return this._widgetBindingList.ToList<Widget>(); } set { this._widgetBindingList = new BindingList<Widget>(value); } } //these do not get serialized private BindingList<Widget> _widgetBindingList; public BindingList<Widget> WidgetBindingList { get { return this._widgetBindingList; } } public WidgetCatalog() { this.WidgetSerializationList = new List<Widget>(); } } So I'm serializing a private List<Widget> property, but the GET and SET accessors of the property are reading from, and writing to theBindingList<Widget> property. Will this even work? It seems like there should be a better way to do this.

Read the article

Creating thousands of records in Rails

- by willCosgrove

Let me set the stage: My application deals with gift cards. When we create cards they have to have a unique string that the user can use to redeem it with. So when someone orders our gift cards, like a retailer, we need to make a lot of new card objects and store them in the DB. With that in mind, I'm trying to see how quickly I can have my application generate 100,000 Cards. Database expert, I am not, so I need someone to explain this little phenomena: When I create 1000 Cards, it takes 5 seconds. When I create 100,000 cards it should take 500 seconds right? Now I know what you're wanting to see, the card creation method I'm using, because the first assumption would be that it's getting slower because it's checking the uniqueness of a bunch of cards, more as it goes along. But I can show you my rake task desc "Creates cards for a retailer" task :order_cards, [:number_of_cards, :value, :retailer_name] => :environment do |t, args| t = Time.now puts "Searching for retailer" @retailer = Retailer.find_by_name(args[:retailer_name]) puts "Retailer found" puts "Generating codes" value = args[:value].to_i number_of_cards = args[:number_of_cards].to_i codes = [] top_off_codes(codes, number_of_cards) while codes != codes.uniq codes.uniq! top_off_codes(codes, number_of_cards) end stored_codes = Card.all.collect do |c| c.code end while codes != (codes - stored_codes) codes -= stored_codes top_off_codes(codes, number_of_cards) end puts "Codes are unique and generated" puts "Creating bundle" @bundle = @retailer.bundles.create!(:value => value) puts "Bundle created" puts "Creating cards" @bundle.transaction do codes.each do |code| @bundle.cards.create!(:code => code) end end puts "Cards generated in #{Time.now - t}s" end def top_off_codes(codes, intended_number) (intended_number - codes.size).times do codes << ReadableRandom.get(CODE_LENGTH) end end I'm using a gem called readable_random for the unique code. So if you read through all of that code, you'll see that it does all of it's uniqueness testing before it ever starts creating cards. It also writes status updates to the screen while it's running, and it always sits for a while at creating. Meanwhile it flies through the uniqueness tests. So my question to the stackoverflow community is: Why is my database slowing down as I add more cards? Why is this not a linear function in regards to time per card? I'm sure the answer is simple and I'm just a moron who knows nothing about data storage. And if anyone has any suggestions, how would you optimize this method, and how fast do you think you could get it to create 100,000 cards? (When I plotted out my times on a graph and did a quick curve fit to get my line formula, I calculated how long it would take to create 100,000 cards with my current code and it says 5.5 hours. That maybe completely wrong, I'm not sure. But if it stays on the line I curve fitted, it would be right around there.)

Read the article

NoHostAvailableException With Cassandra & DataStax Java Driver If Large ResultSet

- by hughj

The setup: 2-node Cassandra 1.2.6 cluster replicas=2 very large CQL3 table with no secondary index Rowkey is a UUID.randomUUID().toString() read consistency set to ONE Using DataStax java driver 1.0 The request: Attempting to do a table scan by "SELECT some-col from schema.table LIMIT nnn;" The fail: Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver. It reads like this: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered)) at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64) at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214) at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169) at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120) Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered)) at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98) at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) Given: This is probably not the most enlightened thing to do to a large table with millions of rows, but this is how I learn what not to do, so I would really appreciate someone who could volunteer how this kind of error can be debugged. For example, when this happens, there are no indications that the nodes in the cluster ever had an issue with the request (there is nothing in the logs on either node that indicate any timeout or failure). Also, I enabled the trace on the driver, which gives you some nice autotrace (ala Oracle) info as long as the query succeeds. But in this case, the driver blows a NoHostAvailableException and no ExecutionInfo is available, so tracing has not provided any benefit in this case. I also find it interesting that this does not seem to be recorded as a timeout (my JMX consoles tell me no timeouts have occurred). So, I am left not understanding WHERE the failure is actually occurring. I am left with the idea that it is the driver that is having a problem, but I don't know how to debug it (and I would really like to). I have read several posts from folks that state that query'g for resultSets 10000 rows is probably not a good idea, and I am willing to accept this, but I would like to understand what is causing the exception and where the exception is happening. FWIW, I also tried bumping the timeout properties in the cassandra.yaml, but this made no difference whatsoever. I welcome any suggestions, anecdotes, insults, or monetary contributions for my registration in the house of moron-developers. Regards!!

Read the article

Fixed point math in c#?

- by x4000

Hi there, I was wondering if anyone here knows of any good resources for fixed point math in c#? I've seen things like this (http://2ddev.72dpiarmy.com/viewtopic.php?id=156) and this (http://stackoverflow.com/questions/79677/whats-the-best-way-to-do-fixed-point-math), and a number of discussions about whether decimal is really fixed point or actually floating point (update: responders have confirmed that it's definitely floating point), but I haven't seen a solid C# library for things like calculating cosine and sine. My needs are simple -- I need the basic operators, plus cosine, sine, arctan2, PI... I think that's about it. Maybe sqrt. I'm programming a 2D RTS game, which I have largely working, but the unit movement when using floating-point math (doubles) has very small inaccuracies over time (10-30 minutes) across multiple machines, leading to desyncs. This is presently only between a 32 bit OS and a 64 bit OS, all the 32 bit machines seem to stay in sync without issue, which is what makes me think this is a floating point issue. I was aware from this as a possible issue from the outset, and so have limited my use of non-integer position math as much as possible, but for smooth diagonal movement at varying speeds I'm calculating the angle between points in radians, then getting the x and y components of movement with sin and cos. That's the main issue. I'm also doing some calculations for line segment intersections, line-circle intersections, circle-rect intersections, etc, that also probably need to move from floating-point to fixed-point to avoid cross-machine issues. If there's something open source in Java or VB or another comparable language, I could probably convert the code for my uses. The main priority for me is accuracy, although I'd like as little speed loss over present performance as possible. This whole fixed point math thing is very new to me, and I'm surprised by how little practical information on it there is on google -- most stuff seems to be either theory or dense C++ header files. Anything you could do to point me in the right direction is much appreciated; if I can get this working, I plan to open-source the math functions I put together so that there will be a resource for other C# programmers out there. UPDATE: I could definitely make a cosine/sine lookup table work for my purposes, but I don't think that would work for arctan2, since I'd need to generate a table with about 64,000x64,000 entries (yikes). If you know any programmatic explanations of efficient ways to calculate things like arctan2, that would be awesome. My math background is all right, but the advanced formulas and traditional math notation are very difficult for me to translate into code.

Read the article

stdio's remove() not always deleting on time.

- by Kyte

For a particular piece of homework, I'm implementing a basic data storage system using sequential files under standard C, which cannot load more than 1 record at a time. So, the basic part is creating a new file where the results of whatever we do with the original records are stored. The previous file's renamed, and a new one under the working name is created. The code's compiled with MinGW 5.1.6 on Windows 7. Problem is, this particular version of the code (I've got nearly-identical versions of this floating around my functions) doesn't always remove the old file, so the rename fails and hence the stored data gets wiped by the fopen(). FILE *archivo, *antiguo; remove("IndiceNecesidades.old"); // This randomly fails to work in time. rename("IndiceNecesidades.dat", "IndiceNecesidades.old"); // So rename() fails. antiguo = fopen("IndiceNecesidades.old", "rb"); // But apparently it still gets deleted, since this turns out null (and I never find the .old in my working folder after the program's done). archivo = fopen("IndiceNecesidades.dat", "wb"); // And here the data gets wiped. Basically, anytime the .old previously exists, there's a chance it's not removed in time for the rename() to take effect successfully. No possible name conflicts both internally and externally. The weird thing's that it's only with this particular file. Identical snippets except with the name changed to Necesidades.dat (which happen in 3 different functions) work perfectly fine. // I'm yet to see this snippet fail. FILE *antiguo, *archivo; remove("Necesidades.old"); rename("Necesidades.dat", "Necesidades.old"); antiguo = fopen("Necesidades.old", "rb"); archivo = fopen("Necesidades.dat", "wb"); Any ideas on why would this happen, and/or how can I ensure the remove() command has taken effect by the time rename() is executed? (I thought of just using a while loop to force call remove() again so long as fopen() returns a non-null pointer, but that sounds like begging for a crash due to overflowing the OS with delete requests or something.)

Read the article

Comparing two large sets of attributes

- by andyashton

Suppose you have a Django view that has two functions: The first function renders some XML using a XSLT stylesheet and produces a div with 1000 subelements like this: <div id="myText"> <p id="p1"><a class="note-p1" href="#" style="display:none" target="bot">?</a></strong>Lorem ipsum</p> <p id="p2"><a class="note-p2" href="#" style="display:none" target="bot">?</a></strong>Foo bar</p> <p id="p3"><a class="note-p3" href="#" style="display:none" target="bot">?</a></strong>Chocolate peanut butter</p> (etc for 1000 lines) <p id="p1000"><a class="note-p1000" href="#" style="display:none" target="bot">?</a></strong>Go Yankees!</p> </div> The second function renders another XML document using another stylesheet to produce a div like this: <div id="myNotes"> <p id="n1"><cite class="note-p1"><sup>1</sup><span>Trololo</span></cite></p> <p id="n2"><cite class="note-p1"><sup>2</sup><span>Trololo</span></cite></p> <p id="n3"><cite class="note-p2"><sup>3</sup><span>lololo</span></cite></p> (etc for n lines) <p id="n"><cite class="note-p885"><sup>n</sup><span>lololo</span></cite></p> </div> I need to see which elements in #myText have classes that match elements in #myNotes, and display them. I can do this using the following jQuery: $('#myText').find('a').each(function() { var $anchor = $(this); $('#myNotes').find('cite').each(function() { if($(this).attr('class') == $anchor.attr('class')) { $anchor.show(); }); }); However this is incredibly slow and inefficient for a large number of comparisons. What is the fastest/most efficient way to do this - is there a jQuery/js method that is reasonable for a large number of items? Or do I need to reengineer the Django code to do the work before passing it to the template?

Read the article

Data access pattern

- by andlju

I need some advice on what kind of pattern(s) I should use for pushing/pulling data into my application. I'm writing a rule-engine that needs to hold quite a large amount of data in-memory in order to be efficient enough. I have some rather conflicting requirements; It is not acceptable for the engine to always have to wait for a full pre-load of all data before it is functional. Only fetching and caching data on-demand will lead to the engine taking too long before it is running quickly enough. An external event can trigger the need for specific parts of the data to be reloaded. Basically, I think I need a combination of pushing and pulling data into the application. A simplified version of my current "pattern" looks like this (in psuedo-C# written in notepad): // This interface is implemented by all classes that needs the data interface IDataSubscriber { void RegisterData(Entity data); } // This interface is implemented by the data access class interface IDataProvider { void EnsureLoaded(Key dataKey); void RegisterSubscriber(IDataSubscriber subscriber); } class MyClassThatNeedsData : IDataSubscriber { IDataProvider _provider; MyClassThatNeedsData(IDataProvider provider) { _provider = provider; _provider.RegisterSubscriber(this); } public void RegisterData(Entity data) { // Save data for later StoreDataInCache(data); } void UseData(Key key) { // Make sure that the data has been stored in cache _provider.EnsureLoaded(key); Entity data = GetDataFromCache(key); } } class MyDataProvider : IDataProvider { List<IDataSubscriber> _subscribers; // Make sure that the data for key has been loaded to all subscribers public void EnsureLoaded(Key key) { if (HasKeyBeenMarkedAsLoaded(key)) return; PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } // Force all subscribers to get a new version of the data for key public void ForceReload(Key key) { PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } void PublishDataToSubscribers(Key key) { Entity data = FetchDataFromStore(key); foreach(var subscriber in _subscribers) { subscriber.RegisterData(data); } } } // This class will be spun off on startup and should make sure that all data is // preloaded as quickly as possible class MyPreloadingThread { IDataProvider _provider; MyPreloadingThread(IDataProvider provider) { _provider = provider; } void RunInBackground() { IEnumerable<Key> allKeys = GetAllKeys(); foreach(var key in allKeys) { _provider.EnsureLoaded(key); } } } I have a feeling though that this is not necessarily the best way of doing this.. Just the fact that explaining it seems to take two pages feels like an indication.. Any ideas? Any patterns out there I should have a look at?

Search Results

Search found 9017 results on 361 pages for 'efficient storage'.

Page 339/361 | < Previous Page | 335 336 337 338 339 340 341 342 343 344 345 346 | Next Page >

- by user1209734

- by Dave

- by belvoir

- by pred2040

- by Ryan Ternier

- by shabbychef

- by James Alexander

- by esc1729

- by Micor

- by Mark Renouf

- by asmaier

- by rkbang

- by mikep

- by Ruben Vermeersch

- by user1796970

- by Eric

- by Staros

- by dk

- by Eric

- by willCosgrove

- by hughj

- by x4000

- by Kyte

- by andyashton

- by andlju

< Previous Page | 335 336 337 338 339 340 341 342 343 344 345 346 | Next Page >