Search Results

Search found 883 results on 36 pages for 'subset'.


Detect duplicate in a subset from a set of elements

- by Abhinav Shrivastava

I have a set of numbers, say: 1 1 2 8 5 6 6 7 8 8 4 2... I want to detect duplicate elements in subsets (of a given size, say k) of the above numbers. For example, consider the consecutive subsets for k=3: Subset 1: {1,1,2} Subset 2: {1,2,8} Subset 3: {2,8,5} Subset 4: {8,5,6} Subset 5: {5,6,6} Subset 6: {6,6,7} ... So my algorithm should detect that subsets 1, 5 and 6 contain duplicates. My approach: 1) Copy the first k elements to a temporary array (vector). 2) Using the <algorithm> header in the C++ STL, call unique() and check whether the size of the vector changes. Any other clue how to approach this problem?
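One possible alternative to copying each window, sketched in Python rather than C++: keep a running count of the values inside the current window and update it as the window slides by one position. The function name `windows_with_duplicates` is hypothetical and not from the question.

```python
from collections import Counter


def windows_with_duplicates(values, k):
    """Return 1-based indices of the length-k sliding windows that repeat a value."""
    if k <= 0 or k > len(values):
        return []
    counts = Counter(values[:k])
    # Number of distinct values that occur more than once in the current window.
    repeated = sum(1 for c in counts.values() if c > 1)
    result = [1] if repeated else []
    for i in range(k, len(values)):
        outgoing, incoming = values[i - k], values[i]
        # Drop the element that leaves the window.
        if counts[outgoing] == 2:
            repeated -= 1
        counts[outgoing] -= 1
        # Add the element that enters the window.
        counts[incoming] += 1
        if counts[incoming] == 2:
            repeated += 1
        if repeated:
            result.append(i - k + 2)  # 1-based index of the window ending at i
    return result
```

Each slide costs O(1) expected time, so the whole scan is O(n) regardless of k, assuming the values are hashable.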

Get a random subset from a set in F#

- by Cay

I am trying to think of an elegant way of getting a random subset from a set in F#. Any thoughts on this? Perhaps this would work: say we have a set of 2^x elements and we need to pick a subset of y elements. Then if we could generate an x-bit random number that contains exactly y powers of 2 (2^n), we effectively have a random mask with y holes in it. We could keep generating new random numbers until we get the first one satisfying this constraint, but is there a better way?
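As a rough illustration in Python (not F#), the retry loop over random masks can be avoided by sampling y positions directly without replacement. This is only a sketch; `random_subset` is a made-up helper name, and the standard library's `random.sample` does the actual work.

```python
import random


def random_subset(items, y, rng=random):
    """Pick y distinct elements uniformly at random from a collection."""
    pool = list(items)  # random.sample needs a sequence, not a set
    return set(rng.sample(pool, y))
```

Because every draw is accepted, the cost does not depend on how unlikely a mask with exactly y set bits would be.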

using subset but old variables still left

- by user2520852

I am working with a data set, which is basically daily usage data (let's just say variables X and Y) by different cities (about 150 cities). I have created a subset of the data for only specific cities, choosing just 3 of the 150 cities. Then when I do tapply by cities, I get means for the 3 cities but also get NA for all the other 147 cities that were in the data set. I am using the code below: df<-read.csv(...) df_sub<-subset(df,df$City==1|df$City==3|df$City==19) X_Breakdown<-tapply(X,df_sub$City, mean, na.rm=TRUE) print(X_Breakdown) City 1 City 2 15 NA City 3 City 4 12 NA City 5 City 6 NA NA Hope you get the idea. I would like to get a dataset that only contains the 3 cities that I'm interested in. It seems that the set of variables is encoded in R; is there a way to fix this? Kindly advise. Thanks

Ideas Related to Subset Sum with 2,3 and more integers

- by rolandbishop

I've been struggling with this problem just like everyone else, and I'm quite sure there have been more than enough posts explaining it. However, in order to understand it fully, I wanted to share my thoughts and get more efficient solutions from all the great people here on the Subset Sum problem. I've searched the Internet and there are actually a lot of sources, but I'd really like to re-implement an algorithm or find my own in order to understand it fully. The key thing I'm struggling with is efficiency, considering the set size will be large. (I do not have a limit, just conceptually large.) The phases I'm trying to implement ideas for are finding two numbers that sum to a given integer T, then three numbers, and eventually K numbers. Some ideas I've thought of: for the two-integer part I'm thinking of basically sorting the array, O(n log n), and for each element in the array searching for its negative value (i.e. if the array element is 3, searching for -3). Maybe a hash table could be better, providing O(1) lookup of the element? For three or more integers I've found an amazing blog post: http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/. However, even the author states that it is not applicable for large numbers. So I was wondering, for 2, 3 and more integers, what ideas could be applied to the subset problem. I'm struggling with setting up a dynamic programming method that will be efficient for large inputs as well.
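The hash-table idea for the two-integer case could look roughly like the Python sketch below. It generalizes the "search for the negative value" step to an arbitrary target T by looking up `target - n`; the name `find_pair_with_sum` is an assumption made for illustration.

```python
def find_pair_with_sum(numbers, target):
    """Return a pair (a, b) of earlier and later elements with a + b == target, or None."""
    seen = set()
    for n in numbers:
        complement = target - n
        if complement in seen:  # expected O(1) membership test
            return (complement, n)
        seen.add(n)
    return None
```

This runs in expected O(n) time and O(n) extra space, compared with O(n log n) for the sort-then-search variant.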

Subset Problem -- Any Materials?

- by bobber205

Yes, this is a homework/lab assignment. I am interested in coming up with/finding an algorithm (that I can comprehend :P) for using "backtracking" to solve the subset sum problem. Anyone have some helpful resources? I've spent the last hour or so Googling with not much luck finding something I think I could actually use. xD Thanks SO!
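For orientation only, a minimal backtracking sketch in Python might look like this. It assumes all numbers are positive integers (so a negative remainder can be pruned), and `subset_sum_backtrack` is a hypothetical name, not something from the assignment.

```python
def subset_sum_backtrack(numbers, target):
    """Return one subset of positive integers summing to target, or None."""
    nums = sorted(numbers, reverse=True)  # try large values first to prune early
    chosen = []

    def search(index, remaining):
        if remaining == 0:
            return True  # the chosen elements add up to the target
        if index == len(nums) or remaining < 0:
            return False  # dead end: out of elements or overshot
        chosen.append(nums[index])  # branch 1: take nums[index]
        if search(index + 1, remaining - nums[index]):
            return True
        chosen.pop()  # undo the choice (the "backtrack" step)
        return search(index + 1, remaining)  # branch 2: skip nums[index]

    return list(chosen) if search(0, target) else None
```

The worst case is still exponential in the number of elements; the pruning only helps when many branches overshoot quickly.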

subset a data.frame with multiple conditions

- by pslice

Suppose my data looks like this: 2372 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 1.3 05/07/2006 9104 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.34 07/23/2006 9212 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.33 02/11/2007 2094 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 1.4 05/06/2007 16763 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.61 05/11/2009 1076 Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.48 05/12/2002 1077 Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.3 05/07/2006 I want to be able to subset by the Analyte and a partial match on the date(namely I just want the year). I have been trying this, but I know it isn't quite right. data[data$Analyte=="ATRAZINE" & grep("2006",as.character(data$Date)),] Any suggestions?

What is the better approach to find if a given set is a perfect subset of a set - If given subset is

- by Microkernel

Hi guys, What is the best approach to find if a given set (unsorted) is a perfect subset of a main set? I have to do some validation in my program where I have to compare the client's request set with the registered internal capability set. I thought of keeping the internal capability set sorted (it will not change once registered) and doing a binary search for each element in the client's request set. Is that the best I could get? I suspect that there might be a better approach. Any idea? Regards, Microkernel
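One commonly suggested alternative to sorting plus binary search is to store the registered set in a hash set once and test each requested element against it. A small Python sketch under that assumption (`make_validator` and `is_allowed` are illustrative names):

```python
def make_validator(capabilities):
    """Build a checker for a capability set that is fixed after registration."""
    registered = frozenset(capabilities)  # hashed once, reused for every request

    def is_allowed(request):
        # True only if every requested element is a registered capability.
        return registered.issuperset(request)

    return is_allowed
```

With m requested elements this is expected O(m) per request, versus O(m log n) for binary search over n sorted capabilities.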

Subset generation by rules

- by Sazug

Let's say that we have 5000 users in a database. Each user row has a sex column, a column for the place where he/she was born, and a status (married or not married) column. How to generate a random subset (let's say 100 users) that would satisfy these conditions: 40% should be males and 60% females; 50% should be born in the USA, 20% in the UK, 20% in Canada, 10% in Australia; 70% should be married and 30% not. These conditions are independent, that is, we cannot do it like this: (0.4 * 0.5 * 0.7) * 100 = 14 users that are male, born in the USA and married; (0.4 * 0.5 * 0.3) * 100 = 6 users that are male, born in the USA and not married. Is there an algorithm for this generation?

Quickly retrieve the subset of properties used in a huge collection in C#

- by ccornet

I have a huge Collection (which I can cast as an enumerable using OfType<T>()) of objects. Each of these objects has a Category property, which is drawn from a list somewhere else in the application. This Collection can reach sizes of hundreds of items, but it is possible that only, say, 6/30 of the possible Categories are actually used. What is the fastest method to find these 6 Categories? The size of the huge Collection discourages me from just iterating across the entire thing and returning all unique values, so is there a faster method of accomplishing this? Ideally I'd collect the categories into a List.
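Without an index maintained alongside the collection, every item has to be inspected at least once, so a single pass that collects distinct values is probably the baseline to compare against. A Python sketch of that pass, using dictionaries with a hypothetical "category" key as stand-ins for the objects:

```python
def used_categories(items):
    """Collect the distinct categories that actually occur, in one pass."""
    return sorted({item["category"] for item in items})
```

If lookups are frequent, keeping a per-category counter up to date on insert and remove would avoid the scan, at the cost of extra bookkeeping.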

The subsets-sum problem and the solvability of NP-complete problems

- by G.E.M.

I was reading about the subset-sums problem when I came up with what appears to be a general-purpose algorithm for solving it: (defun subset-contains-sum (set sum) (let ((subsets) (new-subset) (new-sum)) (dolist (element set) (dolist (subset-sum subsets) (setf new-subset (cons element (car subset-sum))) (setf new-sum (+ element (cdr subset-sum))) (if (= new-sum sum) (return-from subset-contains-sum new-subset)) (setf subsets (cons (cons new-subset new-sum) subsets))) (setf subsets (cons (cons element element) subsets))))) "set" is a list not containing duplicates and "sum" is the sum to search subsets for. "subsets" is a list of cons cells where the "car" is a subset list and the "cdr" is the sum of that subset. New subsets are created from old ones in O(1) time by just cons'ing the element to the front. I am not sure what the runtime complexity of it is, but appears that with each element "sum" grows by, the size of "subsets" doubles, plus one, so it appears to me to at least be quadratic. I am posting this because my impression before was that NP-complete problems tend to be intractable and that the best one can usually hope for is a heuristic, but this appears to be a general-purpose solution that will, assuming you have the CPU cycles, always give you the correct answer. How many other NP-complete problems can be solved like this one?
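To make the growth of "subsets" concrete, here is a rough Python transcription of the Lisp function, written as a sketch that mirrors its structure (including the fact that a single element equal to the sum is not checked). The second return value, the number of stored partial subsets, is an addition for illustration only.

```python
def subset_with_sum(values, target):
    """Return (subset, stored): a matching subset or None, and how many subsets were kept."""
    subsets = []  # each entry is (subset_tuple, sum_of_subset)
    for element in values:
        # Iterate over a snapshot so entries added in this pass are not revisited.
        for subset, total in list(subsets):
            new_subset = (element,) + subset
            new_total = element + total
            if new_total == target:
                return new_subset, len(subsets)
            subsets.append((new_subset, new_total))
        subsets.append(((element,), element))
    return None, len(subsets)
```

When no subset matches, n elements leave 2^n - 1 stored entries, since the list doubles plus one on every element. That is exponential rather than quadratic in the number of elements, which is consistent with the problem being NP-complete: the algorithm is correct but not polynomial-time.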

R: How to pass a list of selection expressions (strings in this case) to the subset function?

- by John

Here is some example data: data = data.frame(series = c("1a", "1b", "1e"), reading = c(0.1, 0.4, 0.6)) > data series reading 1 1a 0.1 2 1b 0.4 3 1e 0.6 Which I can pull out selective single rows using subset: > subset (data, series == "1a") series reading 1 1a 0.1 And pull out multiple rows using a logical OR > subset (data, series == "1a" | series == "1e") series reading 1 1a 0.1 3 1e 0.6 But if I have a long list of series expressions, this gets really annoying to input, so I'd prefer to define them in a better way, something like this: series_you_want = c("1a", "1e") (although even this sucks a little) and be able to do something like this, subset (data, series == series_you_want) The above obviously fails, I'm just not sure what the best way to do this is?
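In R itself the usual idiom for this is membership testing, e.g. subset(data, series %in% series_you_want). The same idea, sketched in Python with a list of dictionaries standing in for the data frame (`rows_in_series` is a made-up name):

```python
def rows_in_series(rows, wanted):
    """Keep the rows whose 'series' value is one of the wanted values."""
    wanted_set = set(wanted)  # set membership instead of a chain of ORs
    return [row for row in rows if row["series"] in wanted_set]
```

The `==` version fails because it compares element by element with recycling, whereas membership asks, for each row, whether its value appears anywhere in the list.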

need a recursive algorithm that determines whether there is a subset of a given length that adds up

- by user311674

This is similar to the subset sum problem, but the subset has to be of a certain length. I need a recursive function.
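A minimal recursive sketch in Python, assuming the input is a list of integers and the question is only whether such a subset exists. The name `has_subset_of_length` and the `start` parameter are illustrative choices.

```python
def has_subset_of_length(numbers, target, length, start=0):
    """True if exactly `length` elements from numbers[start:] sum to target."""
    if length == 0:
        return target == 0  # nothing left to pick: succeed only on an exact hit
    if len(numbers) - start < length:
        return False  # not enough elements remain to reach the required length
    # Either take numbers[start] (one fewer element to pick) or skip it.
    return (
        has_subset_of_length(numbers, target - numbers[start], length - 1, start + 1)
        or has_subset_of_length(numbers, target, length, start + 1)
    )
```

The only difference from the plain subset-sum recursion is the extra `length` counter, which also gives an early exit when too few elements remain.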

Using a subset of GetHashCode() to increase AzureTable performance through partitioning

- by makerofthings7

Generally speaking, Azure Table IO performance improves as more partitions are used (with some tradeoffs in continuation tokens and batch updates I won't go into). Since the partition key is always a string I am considering using a "natural" load balancing technique based on a subset of the GetHashCode() of the partition key, and appending this subset to the partition key itself. This will allow all direct PK/RK queries to be computed with little overhead and with ease. Batch updates may just need an intermediate to group similar PKs together prior to submission. Question: Should I use GetHashCode() to compute the partition key? Is a better function available? If I use GetHashCode() does it matter which character I use for my PK? Is there an abstraction for Azure Table and Blob storage that does this for me already?
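One caveat worth checking before relying on GetHashCode(): .NET does not promise that string hash codes stay the same across runtime versions or processes, so a persisted key derived from them may not be reproducible. A sketch of the same bucketing idea in Python, using a cryptographic digest as a stable hash; the function name, the bucket count of 16 and the "-NN" suffix format are all assumptions, not Azure conventions.

```python
import hashlib


def partitioned_key(key, buckets=16):
    """Append a stable hash-derived bucket number to a partition key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:4], "big") % buckets
    return f"{key}-{bucket:02d}"
```

Because the suffix is a pure function of the key, a direct PK/RK lookup can recompute it without any extra storage round trip.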

How do I select a subset of the available fonts for a particular application

- by Aleve Sicofante

Having all those exotic fonts (for a European), like Chinese, Hindi or Russian fonts, is nice for a web browser. You never get those ugly unicode blocks and get the original glyphs instead. However, having the font menu in LibreOffice or AbiWord populated with all of those fonts is cumbersome and useless for most installations. Having more than a few fonts in note-taking applications is also somewhat overkill. Is there a way I can designate a subset of all the available fonts to work with a particular application? I understand the app itself could do it, but I'm asking for a way to make LibreOffice, for instance, not see certain fonts, only my selection of fonts that are "useful for text processing".

best way to pick a random subset from a collection?

- by Tom

I have a set of objects in a Vector from which I'd like to select a random subset (e.g. 100 items coming back; pick 5 randomly). In my first (very hasty) pass I did an extremely simple and perhaps overly clever solution: Vector itemsVector = getItems(); Collections.shuffle(itemsVector); itemsVector.setSize(5); While this has the advantage of being nice and simple, I suspect it's not going to scale very well, i.e. Collections.shuffle() must be O(n) at least. My less clever alternative is Vector itemsVector = getItems(); Random rand = new Random(System.currentTimeMillis()); // would make this static to the class List subsetList = new ArrayList(5); for (int i = 0; i < 5; i++) { // be sure to use Vector.remove() or you may get the same item twice subsetList.add(itemsVector.remove(rand.nextInt(itemsVector.size()))); } Any suggestions on better ways to draw out a random subset from a Collection?
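A middle ground between shuffling everything and repeated remove() calls is a partial Fisher-Yates shuffle, which only randomizes the first k positions. Sketched in Python rather than Java; `pick_random_subset` is a hypothetical name.

```python
import random


def pick_random_subset(items, k, rng=random):
    """Pick k distinct items uniformly at random with a partial Fisher-Yates shuffle."""
    pool = list(items)  # work on a copy so the caller's collection is untouched
    n = len(pool)
    if k > n:
        raise ValueError("cannot pick more items than are available")
    for i in range(k):
        j = rng.randrange(i, n)  # choose from the not-yet-fixed tail
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]
```

Only k swaps are performed. The defensive copy is still O(n); swapping in place on the original list would avoid it if mutating the input is acceptable.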

Efficient algorithm to find a maximum common subset of two sets?

- by datasunny

Each set contains a bunch of checksums. For example: Set A: { 4445968d0e100ad08323df8c895cea15 a67f8052594d6ba3f75502c0b91b868f 07736dde2f8484a4a3af463e05f039e3 5b1e374ff2ba949ab49870ca24d3163a } Set B: { 6639e1da308fd7b04b7635a17450df7c 4445968d0e100ad08323df8c895cea15 a67f8052594d6ba3f75502c0b91b868f } The maximum common subset of A and B is: { 4445968d0e100ad08323df8c895cea15 a67f8052594d6ba3f75502c0b91b868f } A lot of these operations will be performed, so I'm looking for an efficient algorithm to do so. Thanks for your help.
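The "maximum common subset" described here is the set intersection, which hash-based sets compute in expected time proportional to the smaller set. A short Python sketch (`common_checksums` is an illustrative name):

```python
def common_checksums(a, b):
    """Return the checksums that appear in both collections."""
    return set(a) & set(b)
```

If the same sets are intersected many times, converting each one to a set once and reusing it avoids rebuilding the hash table on every call.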

An algorithm for finding subset matching criteria?

- by Macin

I recently came across a problem which I would like to share some thoughts about with someone on this forum. It relates to finding a subset. In reality it is more complicated, but I tried to present it here using some simpler concepts. To make things easier, I created this conceptual DB model: Let's assume this is a DB for storing recipes. A recipe can have many instruction steps and many ingredients. Ingredients are stored in a cupboard and we know how much of each ingredient we have. Now, when we create a recipe, we have to define how much of each ingredient we need. When we want to use a recipe, we would just check that the required amount is no more than the available amount for each ingredient and then decide if we can cook a dinner: if the amount required for at least one ingredient is greater than the available amount, the recipe cannot be cooked. A simple SQL query gets the result. This is straightforward, but I'm wondering how I should work when the problem is stated the other way round, i.e. how to find recipes which can be cooked only from ingredients that are available? I hope my explanation is clear, but if you need any more clarification, please ask.
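Stated the other way round, the check becomes "keep every recipe for which no ingredient is short". A small in-memory Python sketch of that rule; the dictionary layout and the name `cookable_recipes` are assumptions standing in for the DB model, which is not shown in the excerpt.

```python
def cookable_recipes(recipes, cupboard):
    """Return the names of recipes whose every ingredient is available in full."""
    return [
        name
        for name, needs in recipes.items()
        # A missing ingredient counts as an available amount of 0.
        if all(cupboard.get(ingredient, 0) >= amount for ingredient, amount in needs.items())
    ]
```

In SQL the same condition is often phrased as "no required ingredient exists whose required amount exceeds the stock", i.e. a NOT EXISTS over the recipe's ingredient rows.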

Beginner Question: For extract a large subset of a table from MySQL, how does Indexing, order of tab

- by chongman

Sorry if this is too simple, but thanks in advance for helping. This is for MySQL but might be relevant for other RDBMSs. tblA has 5 columns: colA, colB, colC, mydata, A_id. It has about 10^9 records, with 10^3 distinct values for colA, colB, colC. tblB has 3 columns: colA, colB, B_id. It has about 10^4 records. I want all the records from tblA (except the A_id) that have a match in tblB. In other words, I want to use tblB to describe the subset that I want to extract and then extract those records from tblA. Namely: SELECT a.colA, a.colB, a.colC, a.mydata FROM tblA as a INNER JOIN tblB as b ON a.colA=b.colA AND a.colB=b.colB ; It's taking a really long time (more than an hour) on a newish computer (4GB, Core2Quad, ubuntu), and I just want to check my understanding of the following optimization steps. ** Suppose this is the only query I will ever run on these tables. So ignore the need to run other queries. Now my questions: 1) What indexes should I create to optimize this query? I think I just need a multiple-column index on (colA, colB) for both tables. I don't think I need separate indexes for colA and colB. Another Stack Overflow article (that I can't find) mentioned that adding new indexes is slower when there are existing indexes, so that might be a reason to use the multiple-column index. 2) Is INNER JOIN correct? I just want results where a match is found. 3) Is it faster if I join (tblA to tblB) or the other way around (tblB to tblA)? This previous answer says that the optimizer should take care of that. 4) Does the order of the part after ON matter? This previous answer says that the optimizer also takes care of the execution order.

Boost graph: Apply algorithms considering a specific edge subset.

- by user323547

Hi, I've got a huge graph with typed edges (i.e. each edge has a type property). Let the type set be {A,B,C,D}. I'd like to run the breadth-first search algorithm considering only edges of type A or B. How would you do that? Best, Ugo
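In the Boost Graph Library the usual tool for this is boost::filtered_graph with an edge predicate. The underlying idea, skipping edges whose type is not allowed while traversing, can be sketched in plain Python; the adjacency-dictionary layout and the name `bfs_filtered` are assumptions for illustration.

```python
from collections import deque


def bfs_filtered(adjacency, start, allowed_types):
    """Breadth-first visit order from start, following only edges of allowed types.

    adjacency maps a node to a list of (neighbor, edge_type) pairs.
    """
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, edge_type in adjacency.get(node, ()):
            # The edge filter: ignore edges of any other type.
            if edge_type in allowed_types and neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return visited
```

Filtering during traversal avoids building a second copy of a huge graph, which is the same reason a filtered view is attractive in Boost.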

Is duck typing a subset of polymorphism

- by Raynos

From Polymorphism on Wikipedia: In computer science, polymorphism is a programming language feature that allows values of different data types to be handled using a uniform interface. From duck typing on Wikipedia: In computer programming with object-oriented programming languages, duck typing is a style of dynamic typing in which an object's current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class or implementation of a specific interface. My interpretation is that, based on duck typing, the object's methods/properties determine the valid semantics, meaning that the object's current shape determines the interface it upholds. From polymorphism you can say a function is polymorphic if it accepts multiple different data types as long as they uphold an interface. So if a function can duck type, it can accept multiple different data types and operate on them as long as those data types have the correct methods/properties and thus uphold the interface. (Usage of the term interface is meant not as a code construct but more as a descriptive, documenting construct.) What is the correct relationship between duck typing and polymorphism? If a language can duck type, does it mean it can do polymorphism?
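A tiny Python example may make the interpretation above concrete. The classes `Duck` and `Robot` and the function `announce` are invented for illustration; they share no base class and declare no interface.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No type check: any object with a speak() method is accepted.
    return thing.speak().upper()
```

Here `announce` handles values of unrelated types through one uniform usage, which fits the quoted definition of polymorphism, while the "interface" exists only as the expectation that `speak()` is present.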

View subset of files in folder in Windows 7

- by dev5

I have a folder with thousands of files in it. There is a subset of about 100 files which I need to work with consistently. Is there a way to create some sort of virtual folder or view which only contains these files?

Sitewide 301 Redirect with a subset of different redirects

- by Mike E.

I am trying to make a sitewide 301 redirect for a site with around 400 pages but also have a subset of about 10 individual pages that don't follow the sitewide redirect and should point somewhere else. Any ideas how to format such redirect rules so the sitewide redirect doesnt conflict with the subset pages redirect? I am starting with the sitewide redirect rule as: Options +FollowSymLinks RewriteEngine on RewriteRule (.*) http://www.name.com/$1 [R=301,L]

assign subset of parent table to objects in R

- by Brandon

Hello, I would like to iterate through a table and break it into relevant parts based on the number of visits. I have tried several things but cannot seem to get it to work. I have included the code. for(i in 1:6){ paste("testing.visit",i,"\n",sep="") <- subset(testing,visit_no==2) } But I get the following error. Error in paste("testing.visit", i, "\n", sep = "") <- subset(testing, : target of assignment expands to non-language object Thank you, Brandon

Dynamic programming solution to the subset-sum decision problem

- by Gail

How can a dynamic programming solution for the unbounded knapsack decision problem be used to come up with a dynamic programming solution to the subset-sum decision problem? This limitation seems to render the unbounded knapsack problem useless. In the unbounded knapsack, we simply store true or false for if some subset of integers sum up to our target value. However, if we have a limit on the frequency of the use of these integers, the optimal substructure at least appears to fail. How can this be done?
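One standard way to adapt the table-filling approach, sketched here in Python under the assumption of positive integers and a non-negative target: keep the same one-dimensional table of reachable sums, but scan the sums from high to low for each number so that no number is counted twice. The name `subset_sum_dp` is illustrative.

```python
def subset_sum_dp(numbers, target):
    """Decide whether some subset (each number used at most once) sums to target."""
    reachable = [True] + [False] * target  # reachable[s]: can some subset sum to s?
    for n in numbers:
        # Descending order means reachable[s - n] still refers to subsets
        # that do not yet include n, so n is used at most once.
        for s in range(target, n - 1, -1):
            if reachable[s - n]:
                reachable[s] = True
    return reachable[target]
```

Scanning the sums in ascending order instead would allow unlimited reuse of each number, which is the unbounded-knapsack behaviour; the loop direction is the only change.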

Linqify this: Aggregate a subset of a list

- by JMarsch

Suppose I have a list or array of items, and I want to sum a subset of the items in the list. (in my case, it happens to always be a sequential subset). Here's the old-fashioned way: int sum = 0; for(int i = startIndex; i <= stopIndex; i++) sum += myList[i].TheValue; return sum; What's the best way to linqify that code?
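In LINQ this is typically written with Skip, Take and Sum, e.g. myList.Skip(startIndex).Take(stopIndex - startIndex + 1).Sum(x => x.TheValue). The equivalent slice-and-sum, sketched in Python with a hypothetical `Item` class standing in for the list elements:

```python
from dataclasses import dataclass


@dataclass
class Item:
    the_value: int


def sum_range(items, start_index, stop_index):
    """Sum the_value over items[start_index..stop_index], both ends inclusive."""
    return sum(item.the_value for item in items[start_index:stop_index + 1])
```

Note the `+ 1`: the original loop uses `<=`, so the stop index is inclusive, while both Take and Python slices count from an exclusive end.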

