Search Results

Search found 63386 results on 2536 pages for 'data structure'.

  • How do I write raw binary data in Python?

    - by Chris B.
    I've got a Python program that stores and writes data to a file. The data is raw binary data, stored internally as str. I'm writing it out through a utf-8 codec. However, I get UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 25: character maps to <undefined> in the cp1252.py file. This looks to me like Python is trying to interpret the data using the default code page, but raw binary data doesn't belong to any code page; that's why I'm using str, not unicode. I guess my questions are: how do I represent raw binary data in memory in Python, and when I'm writing raw binary data out through a codec, how do I encode/decode it?
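
    A minimal sketch of the usual remedy, assuming the goal is simply getting bytes onto disk: keep the data in a bytes object and open the file in binary mode, so no codec is involved at all. The file name and sample bytes below are invented for illustration.

        # Raw binary data belongs in a bytes object, not text.
        data = bytes([0x8D, 0x00, 0xFF, 0x10])  # arbitrary sample bytes

        # "wb" writes the bytes verbatim; no encoding or code page applies.
        with open("payload.bin", "wb") as f:
            f.write(data)

        # "rb" reads them back unchanged.
        with open("payload.bin", "rb") as f:
            assert f.read() == data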

  • What are the repercussions of not checking existing data when adding a foreign key?

    - by scottm
    I've inherited a database that doesn't exactly strive for data integrity. I am trying to add some foreign keys to change that, but there is data in some tables that doesn't fit the constraints. Most likely, the data won't be used again, so I want to know what problems I might face by leaving it there. The other option I see is to move it into some kind of table without referential constraints, just for historical purposes. So, what are the repercussions of not checking existing data? If I create a foreign key constraint on a table and don't check existing data, will all new data inserted into the table be enforced?
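
    For intuition, here is a small hypothetical sqlite3 session sketching the behaviour the question asks about: an existing orphan row can be left in place while the constraint still rejects new violations. (SQL Server expresses the same idea with WITH NOCHECK when adding the key; the sketch below only illustrates the principle and is not SQL Server syntax.)

        import sqlite3

        con = sqlite3.connect(":memory:")
        con.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
        con.execute("CREATE TABLE child (id INTEGER PRIMARY KEY,"
                    " parent_id INTEGER REFERENCES parent(id))")

        # Enforcement is off by default here, so an orphan row gets in,
        # much like the legacy rows in the inherited database.
        con.execute("INSERT INTO child VALUES (1, 999)")
        con.commit()  # close the implicit transaction before the PRAGMA

        # Turn enforcement on: the old orphan stays put...
        con.execute("PRAGMA foreign_keys = ON")

        # ...but new violations are rejected from now on.
        try:
            con.execute("INSERT INTO child VALUES (2, 998)")
        except sqlite3.IntegrityError as e:
            print("new insert rejected:", e)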

  • Python 3-compatible HTML to text converter preserving basic structure under a permissive licence?

    - by hawk64
    I am looking for a relatively simple HTML to text converter which displays links and works on strings. So far I have tried lynx, but performance is too bad; html2text, which gives weird and verbose markdown output and is under GPLv3, which is too restrictive for my (BSD-licensed) project; and http://effbot.org/librarybook/formatter-example-3.py using htmllib.HTMLParser with formatter.AbstractFormatter and a custom writer. However, htmllib.HTMLParser is deprecated and has been removed from Python 3. So is there any simple, performant, Python 3-compatible HTML to text converter under a permissive license such as MIT/BSD/Apache and the like? Edit: I don't just need something to strip HTML tags but also to preserve the basic structure of the HTML, that is, output that somewhat resembles that of Lynx.
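
    As a rough starting point under those constraints, Python 3's own html.parser (permissively licensed as part of the stdlib) can be subclassed; the sketch below keeps link targets and inserts line breaks for a few block-level tags. It is nowhere near Lynx-quality rendering, just the skeleton of the approach.

        from html.parser import HTMLParser

        class TextWithLinks(HTMLParser):
            """Tiny HTML-to-text pass that keeps link targets and rough structure."""
            def __init__(self):
                super().__init__()
                self.parts = []

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    href = dict(attrs).get("href")
                    if href:
                        self.parts.append("[" + href + "] ")
                elif tag in ("p", "br", "div", "li", "tr"):
                    self.parts.append("\n")

            def handle_data(self, data):
                self.parts.append(data)

            def text(self):
                return "".join(self.parts)

        p = TextWithLinks()
        p.feed("<p>See <a href='http://example.com'>this</a> page.</p>")
        print(p.text())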

  • Does any language have a while-else flow structure?

    - by dotancohen
    Consider this flow structure which I happen to use often: if ( hasPosts() ) { while ( hasPosts() ) { displayNextPost(); } } else { displayNoPostsContent(); } Are there any programming languages which have an optional else clause for while, which is to be run if the while loop is never entered? Thus, the code above would become: while ( hasPosts() ) { displayNextPost(); } else { displayNoPostsContent(); } I find it interesting that many languages have the do-while construct (run the while body once before checking the condition), yet I have never seen while-else addressed. There is precedent for running one block of code based on what happened in the previous block, such as the try-catch construct. I wasn't sure whether to post here or on programmers.SE. If this question is more appropriate there, then please move it. Thanks.
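
    Python, as it happens, does attach an else clause to while, though with different semantics: the else runs whenever the loop exits without break, even if the body executed. The sketch below shows both Python's construct and a flag-based idiom matching the semantics the question actually wants (the post list is a stand-in for hasPosts() and friends).

        posts = ["first post", "second post"]  # stand-in data

        # Python's while-else: the else runs on any exit without break,
        # even though the body ran -- not quite what the question wants.
        while posts:
            print("post:", posts.pop())
        else:
            print("loop finished without break")

        # The asked-for semantics: else only when the loop never ran.
        # By now the list is empty, so this prints the fallback.
        ran = False
        while posts:
            ran = True
            print("post:", posts.pop())
        if not ran:
            print("no posts content")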

  • ManyToManyField error when having recursive structure. How to solve it?

    - by luc
    Hello, I have the following table in the model, with a recursive structure (a page can have child pages): class DynamicPage(models.Model): name = models.CharField("Titre",max_length=200) parent = models.ForeignKey('self',null=True,blank=True) I want to create another table with a many-to-many relation to this one: class UserMessage(models.Model): name = models.CharField("Nom", max_length=100) page = models.ManyToManyField(DynamicPage) The generated SQL creates the following constraint: ALTER TABLE `website_dynamicpage` ADD CONSTRAINT `parent_id_refs_id_29c58e1b` FOREIGN KEY (`parent_id`) REFERENCES `website_dynamicpage` (`id`); I would like the ManyToMany to reference the page itself (the id) and not the parent field. How do I modify the model so the constraint uses the id and not the parent? Thanks in advance.

  • Want to save a data field from a form into two columns of two models.

    - by vette982
    I have a Profile model with a hasOne relationship to a Detail model. I have a registration form that saves data into both models' tables, but I want the username field from the Profile model to be copied over to the username field in the Detail model, so that each has the same username. function new_account() { if(!empty($this->data)) { $this->Profile->modified = date("Y-m-d H:i:s"); if($this->Profile->save($this->data)) { $this->data['Detail']['profile_id'] = $this->Profile->id; $this->data['Detail']['username'] = $this->Profile->username; $this->Profile->Detail->save($this->data); $this->Session->setFlash('Your registration was successful.'); $this->redirect(array('action'=>'index')); } } } This code in my Profile controller gives me the error: Undefined property: Profile::$username. Any ideas?

  • What Should be the Structure of a C++ Project?

    - by Ell
    I have recently started learning C++ and, coming from a Ruby environment, I have found it very hard to structure a project in a way that it still compiles correctly. I have been using Code::Blocks, which is brilliant, but a downside is that when I add a new header file or C++ source file, it will generate some code, and even though it is a mere 3 or 4 lines, I do not know what these lines do. First of all I would like to ask this question: what do these lines do? #ifndef TEXTGAME_H_INCLUDED #define TEXTGAME_H_INCLUDED #endif // TEXTGAME_H_INCLUDED My second question is: do I need to #include both the .h file and the .cpp file, and in which order? My third question is: where can I find the GNU GCC compiler that, I believe, was packaged with Code::Blocks, and how do I use it without Code::Blocks? I would rather develop in a Notepad++ sort of way, because that is what I'm used to from Ruby, but since C++ is compiled you may think differently (please give advice and views on that as well). Thanks in advance, ell.

  • How to structure opening hours of a place? Maybe there's even an ontology for it?

    - by Ago
    Does anyone know of an ontology which specifies opening hours of places? For example, I have a museum which has 2 seasons. For the low season (season start and end are specified), it is open 10.00 - 18.00 on weekdays and 10-16 on Saturday (on Sunday it's closed); for the high season it's open 10-20 on weekdays and 10-18 on the weekend. If there is no ontology, maybe people have experience with how best to structure information like that? I'm describing the information in RDF, but comments of any kind are welcome (even if you have a relational database which holds the given data). Thanks

  • What happens if the first part of an if-structure is false?

    - by djerry
    Hey guys, I was wondering what happens when a program processes an if-structure with multiple conditions. I have an idea, but I'm not sure about it. I'll give an example: List<string> myTestList = null; if (myTestList != null && myTestList.Count > 0) { //process } The list is null. When processing the if, will it go from left to right, exiting the if as soon as one condition is false? I've tried it and it seems to throw no errors, so I assume the above explains it, but I'm not sure. Thanks in advance.
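
    For what it's worth, the left-to-right short-circuit rule of C#'s && has a direct analogue in Python's and, illustrated below: the right-hand operand is never evaluated once the left-hand one is false.

        my_list = None

        def count_check():
            # would raise TypeError if it ever ran against None
            return len(my_list) > 0

        # `and` short-circuits: count_check() is never called here,
        # so no error is raised -- the same behaviour as C#'s `&&`.
        if my_list is not None and count_check():
            print("process")
        else:
            print("skipped safely")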

  • What's the difference between initializing this structure with these strategies?

    - by mystify
    // the malloc style, which returns a pointer: struct Cat *newCat = malloc(sizeof(struct Cat)); // no malloc... but isn't it actually the same thing? uses memory as well, or not? struct Cat cat = {520.0f, 680.0f, NULL}; Basically, I can get an initialized structure in these two ways. My guess is: it's the same thing, but when I use malloc I also have to free() it. In the second case I don't have to think about memory, because I don't call malloc. Maybe. When should I use the malloc style, and when the other?

  • Store data in an inconvenient table or create a derived table?

    - by user1705685
    I have a certain predefined database structure that I am stuck with. The question is whether this structure is OK for ORM, or whether I should add a processing layer that would create a more convenient structure every time something is inserted into the original DB. To simplify, here's what it kind of looks like. I have a person table: PersonId Name And I have a properties table: PersonId PropertyType PropertyValue So, for person John Doe... (1, 'John Doe') ...I could have three properties: (1, 'phone', '555-55-55'), (1, 'email', '[email protected]'), (1, 'type', 'employee') Using an ORM, I would like to get a "person" object that would have the properties "name", "phone", "email", "type". Can Propel do that? How efficient is it? Is it a better idea to create a table with columns "phone", "email", "type" and fill it automatically as new rows are inserted into the properties table?
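
    The pivot such an ORM would have to perform is simple enough to sketch by hand; the hypothetical rows below mirror the question's example (with a placeholder email, since the original is redacted) and show what a "person" object with flattened properties amounts to.

        people = {1: {"name": "John Doe"}}
        properties = [
            (1, "phone", "555-55-55"),
            (1, "email", "john@example.com"),  # placeholder address
            (1, "type",  "employee"),
        ]

        # Fold each (person_id, type, value) row into the person record.
        for person_id, prop_type, prop_value in properties:
            people[person_id][prop_type] = prop_value

        print(people[1])
        # {'name': 'John Doe', 'phone': '555-55-55',
        #  'email': 'john@example.com', 'type': 'employee'}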

  • What's the simplest way to retrieve all data from a table and save it back in .NET 3.5?

    - by zoman
    I have a number of tables containing some basic (business-related) mapping data. What's the simplest way to load the data from those tables and then save the modified values back? (All data should be replaced in the tables.) An ORM is out of the question, as I would like to avoid creating domain objects for each table. The actual editing of the data is not an issue. (It is exported into Excel, where the data is edited, then the file is uploaded with the modified data.) The technology is .NET 3.5 (ASP.NET MVC) and SQL Server 2005. Thanks.

  • requireJS: How to structure JavaScript for an entire site?

    - by pagewil
    I have 3000+ lines of JavaScript that I need to get into a sensible/maintainable structure. I have chosen to use requireJS, as it has been recommended to me by a few people. I have a bunch of variables that are used throughout the application and need to be available everywhere. I also have a bunch of functions that need to be available everywhere. Apart from these two dependencies, most of the code can be divided off into its own modules. I am having trouble understanding how to manage my main variables so that if one module of code makes changes to the variables, the rest of the JS modules will see that change. I think I need to see a few examples that demonstrate how requireJS is intended to work on a larger scale than the examples in the documentation. If anyone is an experienced requireJS user, I would love to hear your tips!

  • Read file structure into an array, but only specific files.

    - by dmackerman
    I have a directory structure that looks like this:

        /expandables
            - folder
                - BannerInfo.txt
                - index.html
            - folder
            - folder
            - folder

    Each one of the folders has the same exact structure: one file named BannerInfo.txt and one named index.html. There are about 250 of these folders, if that matters. I want to loop through these folders and store each of the index.html files into an array. Inside the index.html file is just some simple HTML and JavaScript, which I want to read into a string to be displayed later on. I'm struggling with how to filter out only the index.html file from the individual folders. The purpose of this is that I want to randomly select an index.html file and put the contents into a textarea. I thought I could do a simple array_rand() on the returned array and spit out the string. Any ideas?
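
    The filtering itself is one wildcard pattern; here is the idea sketched in Python (PHP's glob() accepts the same expandables/*/index.html pattern, so it ports almost verbatim). Paths and the random pick mirror the question's setup.

        import glob
        import random

        # One index.html per banner folder; the pattern does the filtering.
        paths = glob.glob("expandables/*/index.html")

        snippets = []
        for path in paths:
            with open(path, encoding="utf-8") as f:
                snippets.append(f.read())

        if snippets:
            print(random.choice(snippets))  # the array_rand() equivalent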

  • Best way to store PHP data on a page for use with JavaScript/jQuery?

    - by Haroldo
    OK, so I'm trying to work out the fastest way of storing data on my page without slowing the page load. I need to store information in the page to be used later by jQuery. My page is an events page and I want to attach data to each event anchor; there are 100+ events to attach data to. The event anchors are created with a PHP loop, so I could create the data elements within this loop by either (1) using un-semantic attributes, i.e. rel="some_data", or (2) creating a jquery.data() entry for each iteration of the loop; or I could (3) run the loop again, separately, this time inside script tags with jquery.data(). Would really appreciate any thoughts on this!

  • Offsite data storage for a simple app, or a similar supported persistence mechanism?

    - by jdk
    Question: Is there a usable facebook entry point to the Data Storage API that facebook lists on their app admin page for developers, or should I consider an alternate mechanism? What alternative mechanisms exist to simply persist my information offsite (away from my server app) without stuffing it into a cookie that's prone to expire?

    Background: The facebook Data Store Admin tool is made available in a facebook App's Settings. However, when I visit the DataStoreAdmin link nothing works (i.e. clicking the buttons to define the data store types and objects does nothing; I have tried different browsers). The wiki page for the Data Store API hasn't been updated recently, and the second-to-last update says the beta Data Store was taken offline. It seems odd the link would be readily available and highly visible at the top of the App configuration area if indeed it's defunct. I was hoping for some kind of key/value-pair solution to take the data calls off my own server.

  • How to structure data... Sequential or Hierarchical?

    - by Ryan
    I'm going through the exercise of building a CMS that will organize a lot of the common documents that my employer generates each time we get a new sales order. Each new sales order gets a 5-digit number (12222, 12223, 12224, etc...), but internally we have applied a hierarchy to these numbers:

        + 121XX
          |-- 01
          |-- 02
        + 122XX
          |-- 22
          |-- 23
          |-- 24

    In my table for sales orders, is it better to use the 5-digit number as an ID and populate up, or would it be better to use the hierarchical structure that we use when referring to jobs in regular conversation? The only benefit to not populating sequentially seems to be formatting the data later on in my view, but that doesn't sound like a good enough reason to go through the extra work. Thanks
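
    One observation worth sketching: with a fixed two-digit suffix, the hierarchy is recoverable from the sequential number itself, so storing the plain 5-digit ID loses nothing. A hypothetical Python illustration:

        order_id = 12224

        # The "family" prefix and two-digit suffix fall out of divmod.
        prefix, suffix = divmod(order_id, 100)
        print("group: %dXX, member: %02d" % (prefix, suffix))
        # group: 122XX, member: 24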

  • Why is the GUID structure declared the way it is?

    - by alabamasucks
    In rpc.h, the GUID structure is declared as follows: typedef struct _GUID { DWORD Data1; WORD Data2; WORD Data3; BYTE Data4[8]; } GUID; I understand Data1, Data2, and Data3. They define the first, second, and third groups of hex digits when writing out a GUID (XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX). What I never understood was why the last 2 groups were declared together in the same byte array. Wouldn't this have made more sense (and been easier to code against)? typedef struct _GUID { DWORD Data1; WORD Data2; WORD Data3; WORD Data4; BYTE Data5[6]; } GUID; Anyone know why it is declared this way?
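
    The split reflects the UUID wire layout from RFC 4122, in which those 8 bytes are really a 2-byte clock sequence followed by a 6-byte node ID. Python's uuid module exposes the same decomposition, which makes a handy, if indirect, illustration:

        import uuid

        u = uuid.uuid4()
        print(u)  # xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

        # fields = (time_low, time_mid, time_hi_version,
        #           clock_seq_hi_variant, clock_seq_low, node)
        # The last three cover exactly the 8 bytes of Data4:
        # 1 + 1 + 6 bytes = clock sequence plus node ID.
        print(u.fields)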

  • How to convert searchTwitter results (from library(twitteR)) into a data.frame?

    - by analyticsPierce
    I am working on saving twitter search results into a database (SQL Server) and am getting an error when I pull the search results from twitteR. If I execute: library(twitteR) puppy <- as.data.frame(searchTwitter("puppy", session=getCurlHandle(),num=100)) I get an error of: Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class structure("status", package = "twitteR") into a data.frame This is important because in order to use RODBC to add this to a table using sqlSave it needs to be a data.frame. At least that's the error message I got: Error in sqlSave(localSQLServer, puppy, tablename = "puppy_staging", : should be a data frame So does anyone have any suggestions on how to coerce the list to a data.frame or how I can load the list through RODBC?

  • How do I unpack bits from a structure's stream_data in C code?

    - by Chelp
    Ex.: typedef struct { bool streamValid; dword dateTime; dword timeStamp; byte stream_data[800]; } RadioDataA; where stream_data contains the following fields (variable / length in bits): packetID 8, packetL 8, versionMajor 4, versionMinor 4, radioID 8, etc. I need to write: void unpackData(RadioDataA *streamData, MA_DataA *maData) { //unpack streamData (from above) & put some of the data into maData //How do I read in bits of data? I know it's by groups of 8 but I don't understand how. //MA_DataA is also a struct. }
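
    The byte-group reading the question asks about comes down to shifts and masks; here is the idea in a small Python sketch over a hypothetical three-byte stream (field names follow the question's table). A C version would use exactly the same >> and & operators.

        stream = bytes([0x2A, 0x13, 0x47])  # hypothetical first 3 bytes

        packet_id     = stream[0]           # full byte: 8 bits
        packet_len    = stream[1]           # full byte: 8 bits
        version_major = stream[2] >> 4      # high nibble: 4 bits
        version_minor = stream[2] & 0x0F    # low nibble: 4 bits

        print(packet_id, packet_len, version_major, version_minor)
        # 42 19 4 7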

  • Creating a file/folder structure and zipping it up?

    - by makeee
    I have a directory of image files, and I need a PHP script or shell script that will rename them, create a structure of nested directories, and then insert each image into a specified place in the directory hierarchy. Ideally I would just specify a parent directory for each file and a parent directory for each directory, and it would build it. And then finally, I need the script to zip up the whole thing. There's probably not an existing PHP class that will do all this for me, but if anyone knows of a PHP class or other script available online that would handle a lot of this logic, that would be great.
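
    The build-then-zip half of this is a few library calls in most languages; below is a hedged Python sketch (layout and file names invented). In PHP, the same shape falls out of mkdir() plus the ZipArchive class.

        import os
        import zipfile

        # Hypothetical target structure: parent dir -> files to place in it.
        layout = {
            "site/images": ["logo.jpg"],
            "site/css":    ["main.css"],
        }

        for folder, names in layout.items():
            os.makedirs(folder, exist_ok=True)       # build nested dirs
            for name in names:
                with open(os.path.join(folder, name), "wb") as f:
                    f.write(b"placeholder contents")

        # Walk the tree and zip everything up.
        with zipfile.ZipFile("site.zip", "w") as z:
            for root, _dirs, files in os.walk("site"):
                for name in files:
                    z.write(os.path.join(root, name))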

  • RabbitMQ as a proxy between a data store and a producer?

    - by hyperboreean
    I have some code that produces lots of data that should be stored in the database. The problem is that the database can't keep up with the rate at which the data is produced. So I am wondering whether some kind of queuing mechanism would help in this situation. I am thinking in particular of RabbitMQ, and whether it is feasible to have the data stored in its queues until some consumer gets the data out and pushes it to the database. Also, I am not particularly interested in whether the data made it to the database or not, because pretty soon the same data will be updated anyway.
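
    A minimal sketch of the producer side using the pika client, assuming a broker on localhost and an invented queue name; a separate consumer would drain the queue at whatever pace the database tolerates.

        import pika

        conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
        channel = conn.channel()

        # Durable queue so buffered writes survive a broker restart.
        channel.queue_declare(queue="db_writes", durable=True)

        for record in (b"row-1", b"row-2", b"row-3"):  # stand-in payloads
            channel.basic_publish(exchange="",
                                  routing_key="db_writes",
                                  body=record)

        conn.close()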

  • How to fill a structure when a pointer to it is passed as an argument to a function

    - by Ram
    I have a function: void func(struct passwd *pw) { struct passwd *temp; struct passwd *save; temp = getpwnam("someuser"); /* getpwnam returns a pointer to a static data buffer, so I am copying the returned struct to a local struct. */ if (temp) { save = malloc(sizeof *save); if (save) { memcpy(save, temp, sizeof(struct passwd)); /* Here, I have to update the passed pw* with this save struct. */ *pw = *save; /* (~ memcpy) */ } } } The function which calls func(pw) is able to get the updated information, but is it fine to use it as above? The statement *pw = *save is not a deep copy. I do not want to copy each and every member of the structure one by one, like pw->pw_shell = strdup(save->pw_shell), etc. Is there any better way to do it? Thanks.

  • Oracle OpenWorld 2013 – Wrap up by Sven Bernhardt

    - by JuergenKress
    OOW 2013 is over and we're heading home, so it is time to lean back and reflect on the impressions we took away from the conference. First of all: OOW was great! It was a pleasure to be a part of it. As already mentioned in our last blog article, it was the biggest OOW ever. Parallel to the conference, the America's Cup took place in San Francisco and Oracle Team America won. An amazing job by the team, and again congratulations from our side. Back to the conference. The main topics for us were: Oracle SOA / BPM Suite 12c, Adaptive Case Management (ACM), Big Data, Fast Data, Cloud, and Mobile. Below we go into a little more detail on the key takeaways for each of these points.

    Oracle SOA / BPM Suite 12c: During the five days at OOW, first details of the upcoming major releases of Oracle SOA Suite 12c and Oracle BPM Suite 12c were introduced. Some new key features are: Managed File Transfer (MFT) for transferring big files from a source to a target location; enhanced REST support through a new REST binding; a generic cloud adapter, which can be used to connect to different cloud providers, like Salesforce; enhanced analytics with BAM, which has been totally reengineered (the BAM Console now also runs in Firefox!); templates (OSB pipelines, component templates, BPEL activity templates); EM as a single monitoring console; OSB design-time integration into JDeveloper (really great!); and enterprise modeling capabilities in BPM Composer. These are only a few points of what is coming with 12c. We are really looking forward to the new release, because this seems to be really great stuff. The suite becomes more and more integrated. From 10g to 11g it was an evolution in terms of developing SOA-based applications; with 12c, Oracle continues on its way – very impressive.

    Adaptive Case Management: Another fantastic topic was Adaptive Case Management (ACM). The Oracle PMs did a great job, especially at the demo grounds, showing the upcoming Case Management UI (which will become available in 11g with the next BPM Suite MLR patch), the roadmap, and the differences from traditional business process modeling. They were very busy during the conference, because a lot of partners and customers were interested.

    Big Data: Big Data is one of the current hype themes. With huge amounts of data coming in from different internal and external sources, handling that data becomes more and more challenging; companies need to analyze it to optimize their business, and the amount of data grows daily! To store and analyze the data efficiently, a scalable and flexible infrastructure is necessary, and here it is important that hardware and software are engineered to work together. To that end, several new features of the Oracle Database 12c, like the new in-memory option, were presented by Larry Ellison himself. On the hardware side, new server machines like the Fujitsu M10 and new processors, such as Oracle's new M6-32, were announced. The performance improvements when combining these hardware components with the improved software solutions were really impressive. For more details, please take a look at our previous blog post.
    Regarding Big Data, Oracle also introduced their Big Data architecture, which consists of: the Oracle Big Data Appliance, preconfigured with Hadoop; Oracle Exadata, which stores a huge amount of data efficiently to achieve optimal query performance; and Oracle Exalytics as a fast and scalable business analytics system. Analysis of the stored data can be performed using SQL, by streaming the data directly from Hadoop to an Oracle Database 12c; alternatively, the analysis can be implemented directly in Hadoop using "R". In addition, Oracle BI tools can be used to analyze the data.

    Fast Data: Fast Data is a complementary approach to Big Data. A huge amount of mostly unstructured data comes in via different channels at high frequency. The analysis of these data streams is also important for companies, because the incoming data has to be checked for business-relevant patterns in real time, and those patterns must be identified efficiently and performantly. Here, in-memory grid solutions in combination with Oracle Coherence and Oracle Event Processing demonstrated very impressively how efficient real-time data processing can be. One example of a Fast Data solution shown during OOW was the analysis of Twitter streams for customer satisfaction: feeds with negative words like "bad" or "worse" were filtered, and once a defined threshold was reached within a certain timeframe, a business event was triggered.

    Cloud: Another key trend in the IT market is of course Cloud Computing and what it means for companies and their businesses. Oracle announced their Cloud strategy and vision: companies can focus on their real business while all of their applications are available via the Cloud. This also includes Oracle Database and Oracle WebLogic, so that companies can build, deploy, and run their own applications within the Cloud. Three different approaches were introduced: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). With IaaS, only the infrastructure components are managed in the Cloud; customers stay very flexible regarding memory, storage, or the number of CPUs, because those parameters can be adjusted elastically. With PaaS, besides the infrastructure, the platforms (such as databases or application servers) necessary for running applications are also provided within the Cloud; here customers can additionally decide whether installation and management of these components should be done by Oracle. SaaS is the most complete approach: all the applications a company uses are managed in the Cloud, and Oracle is planning to provide all of their applications, like ERP systems or HR applications, as Cloud services. In conclusion, this seems to be a very forward-thinking strategy, which opens up new possibilities for customers to manage their infrastructure and applications in a flexible, scalable, and future-oriented manner.

    As you can see, our OOW days were very, very interesting. We collected a lot of helpful information for our projects. The innovations presented at the conference are great, and being part of it was even greater! We are looking forward to next year's conference!
    Links: http://www.oracle.com/openworld/index.html http://thecattlecrew.wordpress.com/2013/09/23/first-impressions-from-oracle-open-world-2013

  • External File Upload Optimizations for Windows Azure

    - by rgillen
    [Cross-posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx] I'm wrapping up a bit of the work we've been doing on data movement optimizations for cloud computing, and the latest set of data yielded some interesting points I thought I'd share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting.

    Summary: for those who don't like to read detailed posts or don't have time, the synopsis is that if you are uploading data to Azure, block your data (even down to 1MB) and upload in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following the above will result in significant performance gains: upwards of 10x-24x, and a reduction in overall file transfer time of upwards of 90% (e.g., uploading a 1GB file averaged 46.37 minutes prior to optimizations and 1.86 minutes afterwards).

    Detail: For those of you who want more detail, or think that the claims above are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central, as it is physically closest to us) and do not represent intra-cloud results (we have performed intra-cloud tests; the overall results are similar in notion, but the data rates are significantly different, as are the tipping points for the various block sizes... this will be detailed separately).

    We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the Azure tools. The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here. We then created a directory that had a collection of files of the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here.

    The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially, thereby spreading the effects of periodic Internet delays across the collection of results. We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post. For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps yield a sufficiently balanced set of results.
    Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then upload those blocks in parallel (using the PutBlock() method of Azure storage). The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see the linked source for specific implementation details in Program.cs, line 173 and following... less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent, ensuring that the bits that arrived matched what was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process accompanies the original post.

    We then tested the effects of blocking and parallelizing the transfers by running the updated application against the same source set, doing a parameter sweep on the block size including 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn't worth the trouble, and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post. This data was processed and then compared against the single-threaded / non-optimized transfer numbers, and the results were encouraging. The Excel version of the results is available here.

    Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size, you will end up with a "negative optimization" due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (as supported by the raw data provided in the linked worksheet), the charts and discussion below ignore source file sizes less than 1MB.

    The first chart illustrates some interesting points about the results:

      - When the block size is smaller than the source file, performance increases, but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size).
      - For some of the moderately-sized source files, small blocks (256KB) are best.
      - As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, the increased number of individual transfer requests, and reassembly/committal costs).
      - Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant.
      - The 1MB block size gives the best average improvement (~16x), but the optimal approach would be to vary the block size based on the size of the source file.

    A second chart presents another view of the same data with the axes changed (the x-axis represents file size and the plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size, but also the benefits of some of the other block sizes at different source file sizes.
    The last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here, other than that this view of the data highlights the negative effects of poorly choosing a block size for smaller files.

    Summary: What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements. Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) makes short work of altering the shipping client library to provide this functionality, while minimizing the amount of change to existing applications that might be using the client library for other interactions.

    Related Resources: source code for the upload test application; source code for the random file generator; OData feeds of the raw data from the non-optimized transfer tests (experiment metadata, experiment datasets, and raw data for the 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB uploads); OData feeds of the raw data from the blocked/parallelized transfer tests (experiment metadata, experiment datasets, and raw data for the 256KB, 512KB, 1MB, 2MB, and 4MB blocks); and the Excel worksheet showing summarizations and comparisons.
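
    The split, upload-in-parallel, commit pattern described above is easy to outline outside .NET as well. Below is a hedged Python sketch using a thread pool, with upload_block standing in for the real per-block PUT (Azure's Put Block) and the final comment standing in for the commit (Put Block List); the block size, worker count, and test file are invented.

        from concurrent.futures import ThreadPoolExecutor

        BLOCK_SIZE = 1024 * 1024  # 1 MB, the post's best all-round size

        # Fabricate a small test file of three-and-a-bit blocks.
        with open("bigfile.bin", "wb") as f:
            f.write(b"\0" * (3 * BLOCK_SIZE + 123))

        def read_blocks(path):
            """Yield (index, chunk) pairs of up to BLOCK_SIZE bytes."""
            with open(path, "rb") as f:
                index = 0
                while True:
                    chunk = f.read(BLOCK_SIZE)
                    if not chunk:
                        break
                    yield index, chunk
                    index += 1

        def upload_block(item):
            index, chunk = item
            # Stand-in for the real per-block PUT plus MD5 check.
            print("uploaded block %d (%d bytes)" % (index, len(chunk)))

        with ThreadPoolExecutor(max_workers=8) as pool:
            list(pool.map(upload_block, read_blocks("bigfile.bin")))

        # A single commit call would then assemble the blocks in order.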
