Search Results

Search found 214 results on 9 pages for 'tuples'.

Page 5/9 | < Previous Page | 1 2 3 4 5 6 7 8 9  | Next Page >

  • Basic C question, concerning memory allocation and value assignment

    - by VHristov
    Hi there, I have recently started working on my master thesis in C that I haven't used in quite a long time. Being used to Java, I'm now facing all kinds of problems all the time. I hope someone can help me with the following one, since I've been struggling with it for the past two days. So I have a really basic model of a database: tables, tuples, attributes and I'm trying to load some data into this structure. Following are the definitions: typedef struct attribute { int type; char * name; void * value; } attribute; typedef struct tuple { int tuple_id; int attribute_count; attribute * attributes; } tuple; typedef struct table { char * name; int row_count; tuple * tuples; } table; Data is coming from a file with inserts (generated for the Wisconsin benchmark), which I'm parsing. I have only integer or string values. A sample row would look like: insert into table values (9205, 541, 1, 1, 5, 5, 5, 5, 0, 1, 9205, 10, 11, 'HHHHHHH', 'HHHHHHH', 'HHHHHHH'); I've "managed" to load and parse the data and also to assign it. However, the assignment bit is buggy, since all values point to the same memory location, i.e. all rows look identical after I've loaded the data. Here is what I do: char value[10]; // assuming no value is longer than 10 chars int i, j, k; table * data = (table*) malloc(sizeof(data)); data->name = "table"; data->row_count = number_of_lines; data->tuples = (tuple*) malloc(number_of_lines*sizeof(tuple)); tuple* current_tuple; for(i=0; i<number_of_lines; i++) { current_tuple = &data->tuples[i]; current_tuple->tuple_id = i; current_tuple->attribute_count = 16; // static in our system current_tuple->attributes = (attribute*) malloc(16*sizeof(attribute)); for(k = 0; k < 16; k++) { current_tuple->attributes[k].name = attribute_names[k]; // for int values: current_tuple->attributes[k].type = DB_ATT_TYPE_INT; // write data into value-field int v = atoi(value); current_tuple->attributes[k].value = &v; // for string values: current_tuple->attributes[k].type = DB_ATT_TYPE_STRING; current_tuple->attributes[k].value = value; } // ... } While I am perfectly aware, why this is not working, I can't figure out how to get it working. I've tried following things, none of which worked: memcpy(current_tuple->attributes[k].value, &v, sizeof(int)); This results in a bad access error. Same for the following code (since I'm not quite sure which one would be the correct usage): memcpy(current_tuple->attributes[k].value, &v, 1); Not even sure if memcpy is what I need here... Also I've tried allocating memory, by doing something like: current_tuple->attributes[k].value = (int *) malloc(sizeof(int)); only to get "malloc: * error for object 0x100108e98: incorrect checksum for freed object - object was probably modified after being freed." As far as I understand this error, memory has already been allocated for this object, but I don't see where this happened. Doesn't the malloc(sizeof(attribute)) only allocate the memory needed to store an integer and two pointers (i.e. not the memory those pointers point to)? Any help would be greatly appreciated! Regards, Vassil

    Read the article

  • Discuss: PLs are characterised by which (iso)morphisms are implemented

    - by Yttrill
    I am interested to hear discussion of the proposition summarised in the title. As we know programming language constructions admit a vast number of isomorphisms. In some languages in some places in the translation process some of these isomorphisms are implemented, whilst others require code to be written to implement them. For example, in my language Felix, the isomorphism between a type T and a tuple of one element of type T is implemented, meaning the two types are indistinguishable (identical). Similarly, a tuple of N values of the same type is not merely isomorphic to an array, it is an array: the isomorphism is implemented by the compiler. Many other isomorphisms are not implemented for example there is an isomorphism expressed by the following client code: match v with | ((?x,?y),?z = x,(y,z) // Felix match v with | (x,y), - x,(y,z) (* Ocaml *) As another example, a type constructor C of int in Felix may be used directly as a function, whilst in Ocaml you must write a wrapper: let c x = C x Another isomorphism Felix implements is the elimination of unit values, including those in tuples: Felix can do this because (most) polymorphic values are monomorphised which can be done because it is a whole program analyser, Ocaml, for example, cannot do this easily because it supports separate compilation. For the same reason Felix performs type-class dispatch at compile time whilst Haskell passes around dictionaries. There are some quite surprising issues here. For example an array is just a tuple, and tuples can be indexed at run time using a match and returning a value of a corresponding sum type. Indeed, to be correct the index used is in fact a case of unit sum with N summands, rather than an integer. Yet, in a real implementation, if the tuple is an array the index is replaced by an integer with a range check, and the result type is replaced by the common argument type of all the constructors: two isomorphisms are involved here, but they're implemented partly in the compiler translation and partly at run time.

    Read the article

  • Why isn't DSM for unstructured memory done today?

    - by sinned
    Ages ago, Djikstra invented IPC through mutexes which then somehow led to shared memory (SHM) in multics (which afaik had the necessary mmap first). Then computer networks came up and DSM (distributed SHM) was invented for IPC between computers. So DSM is basically a not prestructured memory region (like a SHM) that magically get's synchronized between computers without the applications programmer taking action. Implementations include Treadmarks (inofficially dead now) and CRL. But then someone thought this is not the right way to do it and invented Linda & tuplespaces. Current implementations include JavaSpaces and GigaSpaces. Here, you have to structure your data into tuples. Other ways to achieve similar effects may be the use of a relational database or a key-value-store like RIAK. Although someone might argue, I don't consider them as DSM since there is no coherent memory region where you can put data structures in as you like but have to structure your data which can be hard if it is continuous and administration like locking can not be done for hard coded parts (=tuples, ...). Why is there no DSM implementation today or am I just unable to find one?

    Read the article

  • Generating a KML heatmap from given data set of [lat, lon, density]

    - by Will Croft
    I am looking to build a static KML (Google Earth markup) file which displays a heatmap-style rendering of a few given data sets in the form of [lat, lon, density] tuples. A very straightforward data set I have is for population density. My requirements are: must be able to feed in data for a given lat, lon must be able to specific the given density of the data at that lat, lon must export to KML The requirements are language agnostic for this project as I will be generating these files offline in order to build the KML used elsewhere. I have looked at a few projects, most notably heatmap.py, which is a port of gheat in Python with KML export. I have hit a brick wall in the sense that the projects I have found to date all rely on building the heatmap from the density of [lat, lon] points fed into the algorithm. If I am missing an obvious way to adapt my data set to feed in just the [lat, lon] tuples but adjusting how I feed them using the density values I have, I would love to know!

    Read the article

  • Using PIG with Hadoop, how do I regex match parts of text with an unknown number of groups?

    - by lmonson
    I'm using Amazon's elastic map reduce. I have log files that look something like this random text foo="1" more random text foo="2" more text noise foo="1" blah blah blah foo="1" blah blah foo="3" blah blah foo="4" ... How can I write a pig expression to pick out all the numbers in the 'foo' expressions? I prefer tuples that look something like this: (1,2) (1) (1,3,4) I've tried the following: TUPLES = foreach LINES generate FLATTEN(EXTRACT(line,'foo="([0-9]+)"')); But this yields only the first match in each line: (1) (1) (1)

    Read the article

  • Runtime error in sml function

    - by Rpant
    I am writing this function in sml fun removeNodeswithNoIncoming((x:int,y)::xs,element:int) = if x=element then removeNodeswithNoIncoming (xs , element) else (x,y) :: removeNodeswithNoIncoming (xs , element) | removeNodeswithNoIncoming(nil,element) = nil; the function takes list of tuples [(0,0),(0,1)(1,2)] and another element 0 if first element of the tuples is same as second parameter , it removes it from the list The o/p for the above list should be [(1,2)] Unfortunately , the code is not working for the above input. Though there are other inputs for which it works - removeNodeswithNoIncomingEdge([(0,1),(0,2),(1,2)],0); stdIn:30.1-30.30 Error: unbound variable or constructor: removeNodeswithNoIncomi ngEdge - removeNodeswithNoIncomingEdge([(0,1),(0,2),(1,2),(1,4)],0); stdIn:1.2-1.31 Error: unbound variable or constructor: removeNodeswithNoIncoming Edge

    Read the article

  • haskell sorting

    - by snorlaks
    Hello, How can it be done in most simply way to write (or maybe there is something embedded in haskell) function which takse as arguments list of tuples (String, Int) and Int x and return top x tuples as list according to x value. I wonder if its possible to write a function which also takes 3 argument which is the name of (or index) of filed in tuple according to which sorting has to be done. What are best solutions to make it quite generic, Im new to haskell I would do somethink like that in imperative languages without any problems but I want to know how to write it in quite good way in haskell, thanks for help

    Read the article

  • getting smallest of coordinates that differ by N or more in Python

    - by user248237
    suppose I have a list of coordinates: data = [[(10, 20), (100, 120), (0, 5), (50, 60)], [(13, 20), (300, 400), (100, 120), (51, 62)]] and I want to take all tuples that either appear in each list in data, or any tuple that differs from all tuples in lists other than its own by 3 or less. How can I do this efficiently in Python? For the above example, the results should be: [[(100, 120), # since it occurs in both lists (10, 20), (13, 20), # since they differ by only 3 (50, 60), (51, 60)]] (0, 5) and (300, 400) would not be included, since they don't appear in both lists and are not different from elements in lists other than their own by 3 or less. how can this be computed? thanks.

    Read the article

  • Google Python Class Day 1 Part 2

    Google Python Class Day 1 Part 2 Google Python Class Day 1 Part 2: Lists, Sorting, and Tuples. By Nick Parlante. Support materials and exercises: code.google.com From: GoogleDevelopers Views: 13 0 ratings Time: 35:12 More in Science & Technology

    Read the article

  • Tuple - .NET 4.0 new feature

    - by nmarun
    Something I hit while playing with .net 4.0 – Tuple. MSDN says ‘Provides static methods for creating tuple objects.’ and the example below is: 1: var primes = Tuple.Create(2, 3, 5, 7, 11, 13, 17, 19); Honestly, I’m still not sure with what intention MS provided us with this feature, but the moment I saw this, I said to myself – I could use it instead of anonymous types. In order to put this to test, I created an XML file: 1: <Activities> 2: <Activity id="1" name="Learn Tuples" eventDate="4/1/2010" /> 3: <Activity id="2" name="Finish Project" eventDate="4/29/2010" /> 4: <Activity id="3" name="Attend Birthday" eventDate="4/17/2010" /> 5: <Activity id="4" name="Pay bills" eventDate="4/12/2010" /> 6: </Activities> In my console application, I read this file and let’s say I want to pull all the attributes of the node with id value of 1. Now, I have two ways – either define a class/struct that has these three properties and use in the LINQ query or create an anonymous type on the fly. But if we go the .NET 4.0 way, we can do this using Tuples as well. Let’s see the code I’ve written below: 1: var myActivity = (from activity in loaded.Descendants("Activity") 2:       where (int)activity.Attribute("id") == 1 3:       select Tuple.Create( 4: int.Parse(activity.Attribute("id").Value), 5: activity.Attribute("name").Value, 6: DateTime.Parse(activity.Attribute("eventDate").Value))).FirstOrDefault(); Line 3 is where I’m using a Tuple.Create to define my return type. There are three ‘items’ (that’s what the elements are called) in ‘myActivity’ type.. aptly declared as Item1, Item2, Item3. So there you go, you have another way of creating anonymous types. Just out of curiosity, wanted to see what the type actually looked like. So I did a: 1: Console.WriteLine(myActivity.GetType().FullName); and the return was (formatted for better readability): "System.Tuple`3[                            [System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],                            [System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],                            [System.DateTime, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]                           ]" The `3 specifies the number of items in the tuple. The other interesting thing about the tuple is that it knows the data type of the elements it’s holding. This is shown in the above snippet and also when you hover over myActivity.Item1, it shows the type as an int, Item2 as string and Item3 as DateTime. So you can safely do: 1: int id = myActivity.Item1; 2: string name = myActivity.Item2; 3: DateTime eventDate = myActivity.Item3; Wow.. all I can say is: HAIL 4.0.. HAIL 4.0.. HAIL 4.0

    Read the article

  • Asynchrony in C# 5 (Part II)

    - by javarg
    This article is a continuation of the series of asynchronous features included in the new Async CTP preview for next versions of C# and VB. Check out Part I for more information. So, let’s continue with TPL Dataflow: Asynchronous functions TPL Dataflow Task based asynchronous Pattern Part II: TPL Dataflow Definition (by quote of Async CTP doc): “TPL Dataflow (TDF) is a new .NET library for building concurrent applications. It promotes actor/agent-oriented designs through primitives for in-process message passing, dataflow, and pipelining. TDF builds upon the APIs and scheduling infrastructure provided by the Task Parallel Library (TPL) in .NET 4, and integrates with the language support for asynchrony provided by C#, Visual Basic, and F#.” This means: data manipulation processed asynchronously. “TPL Dataflow is focused on providing building blocks for message passing and parallelizing CPU- and I/O-intensive applications”. Data manipulation is another hot area when designing asynchronous and parallel applications: how do you sync data access in a parallel environment? how do you avoid concurrency issues? how do you notify when data is available? how do you control how much data is waiting to be consumed? etc.  Dataflow Blocks TDF provides data and action processing blocks. Imagine having preconfigured data processing pipelines to choose from, depending on the type of behavior you want. The most basic block is the BufferBlock<T>, which provides an storage for some kind of data (instances of <T>). So, let’s review data processing blocks available. Blocks a categorized into three groups: Buffering Blocks Executor Blocks Joining Blocks Think of them as electronic circuitry components :).. 1. BufferBlock<T>: it is a FIFO (First in First Out) queue. You can Post data to it and then Receive it synchronously or asynchronously. It synchronizes data consumption for only one receiver at a time (you can have many receivers but only one will actually process it). 2. BroadcastBlock<T>: same FIFO queue for messages (instances of <T>) but link the receiving event to all consumers (it makes the data available for consumption to N number of consumers). The developer can provide a function to make a copy of the data if necessary. 3. WriteOnceBlock<T>: it stores only one value and once it’s been set, it can never be replaced or overwritten again (immutable after being set). As with BroadcastBlock<T>, all consumers can obtain a copy of the value. 4. ActionBlock<TInput>: this executor block allows us to define an operation to be executed when posting data to the queue. Thus, we must pass in a delegate/lambda when creating the block. Posting data will result in an execution of the delegate for each data in the queue. You could also specify how many parallel executions to allow (degree of parallelism). 5. TransformBlock<TInput, TOutput>: this is an executor block designed to transform each input, that is way it defines an output parameter. It ensures messages are processed and delivered in order. 6. TransformManyBlock<TInput, TOutput>: similar to TransformBlock but produces one or more outputs from each input. 7. BatchBlock<T>: combines N single items into one batch item (it buffers and batches inputs). 8. JoinBlock<T1, T2, …>: it generates tuples from all inputs (it aggregates inputs). Inputs could be of any type you want (T1, T2, etc.). 9. BatchJoinBlock<T1, T2, …>: aggregates tuples of collections. It generates collections for each type of input and then creates a tuple to contain each collection (Tuple<IList<T1>, IList<T2>>). Next time I will show some examples of usage for each TDF block. * Images taken from Microsoft’s Async CTP documentation.

    Read the article

  • Types issue in F#

    - by Andry
    Hello! In my ongoing adventure deep diving into f# I am understanding a lot of this powerful language but there are things that I still do not understand so clearly. One of the most important issues I need to master is types. Well the book I am reading is very straight forward and introduces entities and main functionalities with a direct approach. The first thing I could get start with is types. It introduces the main types as list, option, tuples, and so on... It is clearly underlined that all these types are IMMUTABLE for many reasons regarding functional programming and data consistance in functional programing. Well, no problems until now... But now I am getting started with Concrete Types... Well... I have problems in managing with types like list, option, tuples, types created through new operator and concrete types created using type keyword (for abbreviations, concrete types...). So my question is: how can I efficently catalogue/distinguish all types of data in f#???? I can create a perfect separation among types in C#, VB.NET... FOr example in VB.NET there are value and reference types while in C# there are only references and also int, double are treated as objects (they are objects while in VB.NET a value type is not a object and there is a split in types for this reason). Well in F# I cannot create such differences among types in the language. Can you help me? I hope I was clear.

    Read the article

  • Ocaml Pattern Matching

    - by Atticus
    Hey guys, I'm pretty new to OCaml and pattern matching, so I was having a hard time trying to figure this out. Say that I have a list of tuples. What I want to do is match a parameter with one of the tuples based on the first element in the tuple, and upon doing so, I want to return the second element of the tuple. So for example, I want to do something like this: let list = [ "a", 1; "b", 2"; "c", 3; "d", 4 ] ;; let map_left_to_right e rules = match e with | first -> second | first -> second | first -> second If I use map_left_to_right "b" list, I want to get 2 in return. I therefore want to list out all first elements in the list of rules and match the parameter with one of these elements, but I am not sure how to do so. I was thinking that I need to use either List.iter or List.for_all to do something like this. Any help would be appreciated. Thanks!

    Read the article

  • Setting an Excel Range with an Array using Python and comtypes?

    - by technomalogical
    Using comtypes to drive Python, it seems some magic is happening behind the scenes that is not converting tuples and lists to VARIANT types: # RANGE(“C14:D21”) has values # Setting the Value on the Range with a Variant should work, but # list or tuple is not getting converted properly it seems >>>from comtypes.client import CreateObject >>>xl = CreateObject("Excel.application") >>>xl.Workbooks.Open(r'C:\temp\my_file.xlsx') >>>xl.Visible = True >>>vals=tuple([(x,y) for x,y in zip('abcdefgh',xrange(8))]) # creates: #(('a', 0), ('b', 1), ('c', 2), ('d', 3), ('e', 4), ('f', 5), ('g', 6), ('h', 7)) >>>sheet = xl.Workbooks[1].Sheets["Sheet1"] >>>sheet.Range["C14","D21"].Value() (('foo',1),('foo',2),('foo',3),('foo',4),('foo',6),('foo',6),('foo',7),('foo',8)) >>>sheet.Range["C14","D21"].Value[()] = vals # no error, this blanks out the cells in the Range According to the comtypes docs: When you pass simple sequences (lists or tuples) as VARIANT parameters, the COM server will receive a VARIANT containing a SAFEARRAY of VARIANTs with the typecode VT_ARRAY | VT_VARIANT. This seems to be inline with what MSDN says about passing an array to a Range's Value. I also found this page showing something similar in C#. Can anybody tell me what I'm doing wrong? EDIT I've come up with a simpler example that performs the same way (in that, it does not work): >>>from comtypes.client import CreateObject >>>xl = CreateObject("Excel.application") >>>xl.Workbooks.Add() >>>sheet = xl.Workbooks[1].Sheets["Sheet1"] # at this point, I manually typed into the range A1:B3 >>> sheet.Range("A1","B3").Value() ((u'AAA', 1.0), (u'BBB', 2.0), (u'CCC', 3.0)) >>>sheet.Range("A1","B3").Value[()] = [(x,y) for x,y in zip('xyz',xrange(3))] # Using a generator expression, per @Mike's comment # However, this still blanks out my range :(

    Read the article

  • Python: Created nested dictionary from list of paths

    - by sberry2A
    I have a list of tuples the looks similar to this (simplified here, there are over 14,000 of these tuples with more complicated paths than Obj.part) [ (Obj1.part1, {<SPEC>}), (Obj1.partN, {<SPEC>}), (ObjK.partN, {<SPEC>}) ] Where Obj goes from 1 - 1000, part from 0 - 2000. These "keys" all have a dictionary of specs associated with them which act as a lookup reference for inspecting another binary file. The specs dict contains information such as the bit offset, bit size, and C type of the data pointed to by the path ObjK.partN. For example: Obj4.part500 might have this spec, {'size':32, 'offset':128, 'type':'int'} which would let me know that to access Obj4.part500 in the binary file I must unpack 32 bits from offset 128. So, now I want to take my list of strings and create a nested dictionary which in the simplified case will look like this data = { 'Obj1' : {'part1':{spec}, 'partN':{spec} }, 'ObjK' : {'part1':{spec}, 'partN':{spec} } } To do this I am currently doing two things, 1. I am using a dotdict class to be able to use dot notation for dictionary get / set. That class looks like this: class dotdict(dict): def __getattr__(self, attr): return self.get(attr, None) __setattr__ = dict.__setitem__ __delattr__ = dict.__delitem__ The method for creating the nested "dotdict"s looks like this: def addPath(self, spec, parts, base): if len(parts) > 1: item = base.setdefault(parts[0], dotdict()) self.addPath(spec, parts[1:], item) else: item = base.setdefault(parts[0], spec) return base Then I just do something like: for path, spec in paths: self.lookup = dotdict() self.addPath(spec, path.split("."), self.lookup) So, in the end self.lookup.Obj4.part500 points to the spec. Is there a better (more pythonic) way to do this?

    Read the article

  • SQL indexes for "not equal" searches

    - by bortzmeyer
    The SQL index allows to find quickly a string which matches my query. Now, I have to search in a big table the strings which do not match. Of course, the normal index does not help and I have to do a slow sequential scan: essais=> \d phone_idx Index "public.phone_idx" Column | Type --------+------ phone | text btree, for table "public.phonespersons" essais=> EXPLAIN SELECT person FROM PhonesPersons WHERE phone = '+33 1234567'; QUERY PLAN ------------------------------------------------------------------------------- Index Scan using phone_idx on phonespersons (cost=0.00..8.41 rows=1 width=4) Index Cond: (phone = '+33 1234567'::text) (2 rows) essais=> EXPLAIN SELECT person FROM PhonesPersons WHERE phone != '+33 1234567'; QUERY PLAN ---------------------------------------------------------------------- Seq Scan on phonespersons (cost=0.00..18621.00 rows=999999 width=4) Filter: (phone <> '+33 1234567'::text) (2 rows) I understand (see Mark Byers' very good explanations) that PostgreSQL can decide not to use an index when it sees that a sequential scan would be faster (for instance if almost all the tuples match). But, here, "not equal" searches are really slower. Any way to make these "is not equal to" searches faster? Here is another example, to address Mark Byers' excellent remarks. The index is used for the '=' query (which returns the vast majority of tuples) but not for the '!=' query: essais=> EXPLAIN ANALYZE SELECT person FROM EmailsPersons WHERE tld(email) = 'fr'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Index Scan using tld_idx on emailspersons (cost=0.25..4010.79 rows=97033 width=4) (actual time=0.137..261.123 rows=97110 loops=1) Index Cond: (tld(email) = 'fr'::text) Total runtime: 444.800 ms (3 rows) essais=> EXPLAIN ANALYZE SELECT person FROM EmailsPersons WHERE tld(email) != 'fr'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Seq Scan on emailspersons (cost=0.00..27129.00 rows=2967 width=4) (actual time=1.004..1031.224 rows=2890 loops=1) Filter: (tld(email) <> 'fr'::text) Total runtime: 1037.278 ms (3 rows) DBMS is PostgreSQL 8.3 (but I can upgrade to 8.4).

    Read the article

  • Efficient way to remove empty lists from lists without evaluating held expressions?

    - by Alexey Popkov
    In previous thread an efficient way to remove empty lists ({}) from lists was suggested: Replace[expr, x_List :> DeleteCases[x, {}], {0, Infinity}] Using the Trott-Strzebonski in-place evaluation technique this method can be generalized for working also with held expressions: f1[expr_] := Replace[expr, x_List :> With[{eval = DeleteCases[x, {}]}, eval /; True], {0, Infinity}] This solution is more efficient than the one based on ReplaceRepeated: f2[expr_] := expr //. {left___, {}, right___} :> {left, right} But it has one disadvantage: it evaluates held expressions if they are wrapped by List: In[20]:= f1[Hold[{{}, 1 + 1}]] Out[20]= Hold[{2}] So my question is: what is the most efficient way to remove all empty lists ({}) from lists without evaluating held expressions? The empty List[] object should be removed only if it is an element of another List itself. Here are some timings: In[76]:= expr = Tuples[Tuples[{{}, {}}, 3], 4]; First@Timing[#[expr]] & /@ {f1, f2, f3} pl = Plot3D[Sin[x y], {x, 0, Pi}, {y, 0, Pi}]; First@Timing[#[pl]] & /@ {f1, f2, f3} Out[77]= {0.581, 0.901, 5.027} Out[78]= {0.12, 0.21, 0.18} Definitions: Clear[f1, f2, f3]; f3[expr_] := FixedPoint[ Function[e, Replace[e, {a___, {}, b___} :> {a, b}, {0, Infinity}]], expr]; f1[expr_] := Replace[expr, x_List :> With[{eval = DeleteCases[x, {}]}, eval /; True], {0, Infinity}]; f2[expr_] := expr //. {left___, {}, right___} :> {left, right};

    Read the article

  • How to protect SHTML pages from crawlers/spiders/scrapers?

    - by Adam Lynch
    I have A LOT of SHTML pages I want to protect from crawlers, spiders & scrapers. I understand the limitations of SSIs. An implementation of the following can be suggested in conjunction with any technology/technologies you wish: The idea is that if you request too many pages too fast you're added to a blacklist for 24 hrs and shown a captcha instead of content, upon every page you request. If you enter the captcha correctly you've removed from the blacklist. There is a whitelist so GoogleBot, etc. will never get blocked. Which is the best/easiest way to implement this idea? Server = IIS Cleaning out the old tuples from a DB every 24 hrs is easily done so no need to explain that.

    Read the article

  • Scuttlebutt Reconciliation from "Efficient Reconciliation and Flow Control for Anti-Entropy Protocols"

    - by Maus
    This question might be more suited to math.stackexchange.com, but here goes: Their Version Reconciliation takes two parts-- first the exchange of digests, and then an exchange of updates. I'll first paraphrase the paper's description of each step. To exchange digests, two peers send one another a set of pairs-- (peer, max_version) for each peer in the network, and then each one responds with a set of deltas. The deltas look like: (peer, key, value, version), for all tuples for which peer's state maps the key to the given value and version, and the version number is greater than the maximum version number peer has seen. This seems to require that each node remember the state of each other node, and the highest version number and ID each node has seen. Question Why must we iterate through all peers to exchange information between p and q?

    Read the article

  • Is there a name for a mini class that is not a struct?

    - by user1555300
    I have a couple of mini-classes that are not nested classes. They need to be passed around between different larger classes to use. In a way, they act like Tuples, storing fields in them. For example, [Serializable] public class TransformObject { public GameObject GameObj; public tk2dCameraAnchor Anchor; public ManagerTransform MTransform; } I have a few of them for my game I have been developing. They have to be classes, not struct because the Unity3d editor will not show in the inspector if so. I was just wondering if there is a official name for these kind of mini classes.

    Read the article

  • Sending a Tuple object over WCF?

    - by Donut
    Is the System.Tuple class supported by WCF's Data Contract Serializer (i.e., can I pass Tuple objects to WCF calls and/or receive them as part or all of the result)? I found this page, but not the clear, definitive "you can send and receive Tuples with WCF" answer I was hoping for. I'm guessing that you can, as long as all of the types within the Tuple itself are supported by the Data Contract Serializer -- can anyone provide me with a more definitive answer? Thanks.

    Read the article

  • How to convert strings into integers in python?

    - by elfuego1
    Hello there, I have a tuple of tuples from MySQL query like this: T1 = (('13', '17', '18', '21', '32'), ('07', '11', '13', '14', '28'), ('01', '05', '06', '08', '15', '16')) I'd like to convert all the string elements into integers and put it back nicely to list of lists this time: T2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]] I tried to achieve it with "eval" but didn't get any decent result yet.

    Read the article

  • Hadoop/Pig Cross-join

    - by sagie
    Hi I am using Pig to cross join two data sets, both with format. This will result as . In my case, if I have both tuples and , it is a duplication. Can I filter those duplications (or not joining them at all)? thanks, Sagie

    Read the article

  • Solving embarassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems? Embarassingly parallel problems typically consist of three basic parts: Read input data (from a file, database, tcp connection, etc.). Run calculations on the input data, where each calculation is independent of any other calculation. Write results of calculations (to a file, database, tcp connection, etc.). We can parallelize the program in two dimensions: Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter. Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out. This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing. Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel: Process the input file into raw data (lists/iterables of integers) Calculate the sums of the data, in parallel Output the sums Below is traditional, single-process bound Python program which solves these three tasks: #!/usr/bin/env python # -*- coding: UTF-8 -*- # basicsums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file. """ import csv import optparse import sys def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) return cli_parser def parse_input_csv(csvfile): """Parses the input CSV and yields tuples with the index of the row as the first element, and the integers of the row as the second element. The index is zero-index based. :Parameters: - `csvfile`: a `csv.reader` instance """ for i, row in enumerate(csvfile): row = [int(entry) for entry in row] yield i, row def sum_rows(rows): """Yields a tuple with the index of each input list of integers as the first element, and the sum of the list of integers as the second element. The index is zero-index based. :Parameters: - `rows`: an iterable of tuples, with the index of the original row as the first element, and a list of integers as the second element """ for i, row in rows: yield i, sum(row) def write_results(csvfile, results): """Writes a series of results to an outfile, where the first column is the index of the original row of data, and the second column is the result of the calculation. The index is zero-index based. :Parameters: - `csvfile`: a `csv.writer` instance to which to write results - `results`: an iterable of tuples, with the index (zero-based) of the original row as the first element, and the calculated result from that row as the second element """ for result_row in results: csvfile.writerow(result_row) def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # gets an iterable of rows that's not yet evaluated input_rows = parse_input_csv(in_csvfile) # sends the rows iterable to sum_rows() for results iterable, but # still not evaluated result_rows = sum_rows(input_rows) # finally evaluation takes place as a chain in write_results() write_results(out_csvfile, result_rows) infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, that needs to be fleshed out to address the parts in the comments: #!/usr/bin/env python # -*- coding: UTF-8 -*- # multiproc_sums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file, using multiple processes if desired. """ import csv import multiprocessing import optparse import sys NUM_PROCS = multiprocessing.cpu_count() def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) cli_parser.add_option('-n', '--numprocs', type='int', default=NUM_PROCS, help="Number of processes to launch [DEFAULT: %default]") return cli_parser def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # Parse the input file and add the parsed data to a queue for # processing, possibly chunking to decrease communication between # processes. # Process the parsed data as soon as any (chunks) appear on the # queue, using as many processes as allotted by the user # (opts.numprocs); place results on a queue for output. # # Terminate processes when the parser stops putting data in the # input queue. # Write the results to disk as soon as they appear on the output # queue. # Ensure all child processes have terminated. # Clean up files. infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github. I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all: Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read? Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results? Should I use a processes pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()? Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9  | Next Page >