Search Results

Search found 4640 results on 186 pages for 'unique'.

Page 3/186 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Handling close-to-impossible collisions on should-be-unique values

- by balpha

There are many systems that depend on the uniqueness of some particular value. Anything that uses GUIDs comes to mind (eg. the Windows registry or other databases), but also things that create a hash from an object to identify it and thus need this hash to be unique. A hash table usually doesn't mind if two objects have the same hash because the hashing is just used to break down the objects into categories, so that on lookup, not all objects in the table, but only those objects in the same category (bucket) have to be compared for identity to the searched object. Other implementations however (seem to) depend on the uniqueness. My example (that's what lead me to asking this) is Mercurial's revision IDs. An entry on the Mercurial mailing list correctly states The odds of the changeset hash colliding by accident in your first billion commits is basically zero. But we will notice if it happens. And you'll get to be famous as the guy who broke SHA1 by accident. But even the tiniest probability doesn't mean impossible. Now, I don't want an explanation of why it's totally okay to rely on the uniqueness (this has been discussed here for example). This is very clear to me. Rather, I'd like to know (maybe by means of examples from your own work): Are there any best practices as to covering these improbable cases anyway? Should they be ignored, because it's more likely that particularly strong solar winds lead to faulty hard disk reads? Should they at least be tested for, if only to fail with a "I give up, you have done the impossible" message to the user? Or should even these cases get handled gracefully? For me, especially the following are interesting, although they are somewhat touchy-feely: If you don't handle these cases, what do you do against gut feelings that don't listen to probabilities? If you do handle them, how do you justify this work (to yourself and others), considering there are more probable cases you don't handle, like a supernonva?

Read the article
Generate unique ID from multiple values with fault tolerance

- by ojreadmore

Given some values, I'd like to make a (pretty darn) unique result. $unique1 = generate(array('ab034', '981kja7261', '381jkfa0', 'vzcvqdx2993883i3ifja8', '0plnmjfys')); //now $unique1 == "sqef3452y"; I also need something that's pretty close to return the same result. In this case, 20% of the values is missing. $unique2 = generate(array('ab034', '981kja7261', '381jkfa0', 'vzcvqdx2993883i3ifja8')); //also $unique2 == "sqef3452y"; I'm not sure where to begin with such an algorithm but I have some assumptions. I assume that the more values given, the more accurate the resulting ID – in other words, using 20 values is better than 5. I also assume that a confidence factor can be calculated and adjusted. What would be nice to have is a weight factor where one can say 'value 1 is more important than value 3'. This would require a multidimensional array for input instead of one dimension. I just mashed on the keyboard for these values, but in practice they may be short or long alpha numeric values.

Read the article
XSL unique values per node

- by Nathan

ok i have this xml <roots> <root> <name>first</name> <item type='test'><something>A</something></item> <item type='test'><something>B</something></item> <item type='test'><something>C</something></item> <item type='test'><something>A</something></item> <item type='other'><something>A</something></item> <item type='test'><something>B</something></item> <item type='other'><something>D</something></item> </root> <root> <name>second</name> <item type='test'><something>E</something></item> <item type='test'><something>B</something></item> <item type='test'><something>F</something></item> <item type='test'><something>A</something></item> <item type='other'><something>A</something></item> <item type='test'><something>B</something></item> <item type='other'><something>D</something></item> </root> </roots> now i need to get the unique values of each root node so far i have <?xml version="1.0" encoding="ISO-8859-1"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output indent="yes" method="text"/> <xsl:key name="item-by-value" match="something" use="."/> <xsl:key name="rootkey" match="root" use="name"/> <xsl:template match="/"> <xsl:for-each select="key('rootkey','second')"> <xsl:for-each select="item/something"> <xsl:if test="generate-id() = generate-id(key('item-by-value', normalize-space(.)))"> <xsl:value-of select="."/> </xsl:if> </xsl:for-each> </xsl:for-each> </xsl:template> </xsl:stylesheet> if i use "First" as the key to get only the first root i get a good result ABCD how ever if i use "second" i only get EF but i need the result to be ABDFE

Read the article
Efficient way to find unique elements in a vector compared against multiple vectors

- by SyncMaster

I am trying find the number of unique elements in a vector compared against multiple vectors using C++. Suppose I have, v1: 5, 8, 13, 16, 20 v2: 2, 4, 6, 8 v3: 20 v4: 1, 2, 3, 4, 5, 6, 7 v5: 1, 3, 5, 7, 11, 13, 15 The number of unique elements in v1 is 1 (i.e. number 16). I tried two approaches. Added vectors v2,v3,v4 and v5 into a vector of vector. For each element in v1, checked if the element is present in any of the other vectors. Combined all the vectors v2,v3,v4 and v5 using merge sort into a single vector and compared it against v1 to find the unique elements. Note: sample_vector = v1 and all_vectors_merged contains v2,v3,v4,v5 //Method 1 unsigned int compute_unique_elements_1(vector<unsigned int> sample_vector,vector<vector<unsigned int> > all_vectors_merged) { unsigned int duplicate = 0; for (unsigned int i = 0; i < sample_vector.size(); i++) { for (unsigned int j = 0; j < all_vectors_merged.size(); j++) { if (std::find(all_vectors_merged.at(j).begin(), all_vectors_merged.at(j).end(), sample_vector.at(i)) != all_vectors_merged.at(j).end()) { duplicate++; } } } return sample_vector.size()-duplicate; } // Method 2 unsigned int compute_unique_elements_2(vector<unsigned int> sample_vector, vector<unsigned int> all_vectors_merged) { unsigned int unique = 0; unsigned int i = 0, j = 0; while (i < sample_vector.size() && j < all_vectors_merged.size()) { if (sample_vector.at(i) > all_vectors_merged.at(j)) { j++; } else if (sample_vector.at(i) < all_vectors_merged.at(j)) { i++; unique ++; } else { i++; j++; } } if (i < sample_vector.size()) { unique += sample_vector.size() - i; } return unique; } Of these two techniques, I see that Method 2 gives faster results. 1) Method 1: Is there a more efficient way to find the elements than running std::find on all the vectors for all the elements in v1. 2) Method 2: Extra overhead in comparing vectors v2,v3,v4,v5 and sorting them. How can I do this in a better way?

Read the article
PHP mySQL - select unique value that not being used from dirrefent table

- by apis17

Updates : Please see below i have table: data +-----------------------+--------------+-----------+ | State | d_country | d_postcode| +-----------------------+--------------+-----------+ | State1 | Country1 | 1111 | | State2 | Country2 | 2222 | | State3 | Country3 | 3333 | | State4 | Country4 | 4444 | +-----------------------+--------------+-----------+ And another table: user +-----------------------+--------------+-----------+ | Name | u_country | u_postcode| +-----------------------+--------------+-----------+ | Name1 | Country3 | 3333 | | Name2 | Country5 | 5555 | | Name3 | | 6666 | | Name4 | Country6 | 6666 | | Name5 | Country6 | 6666 | +-----------------------+--------------+-----------+ What SQL should i use to: Determine the number (count) of country that are not listed on table data. For example u_postcode is not listed in d_postcode is 5555 and 6666. It will return 2. List down name and what country not available in table data yet. Updates I want to use grouping to filter postcode and make Name3 and Name4 as different rows. For example: +-----------------------+--------------+-----------+ | Name | u_country | u_postcode| +-----------------------+--------------+-----------+ | Name2 | Country5 | 5555 | | Name3 | | 6666 | | Name4 | Country6 | 6666 | +-----------------------+--------------+-----------+ Any possible idea?

Read the article
Expanding Rows for Unique Checkboxes

- by Marc Morgan

I was just recently given a project for my job to write a script that compares two mysql databases and print out information into an html table. Currently, I am trying to insert a checkbox by each individual's name and when selected, rows pertaining to that individual will expand underneath the person's name. I am combining javascript into my script to do this, although I really have no experience is it. The problem I am having is that when any checkbox is selected, all the rows for each individual is expanding instead of the rows pertaining only to the one individual selected. Here is the coding so far: <?php $link = mysql_connect ($server = "harris.lib.fit.edu", $username = "", $password = "") or die(mysql_error()); $db = mysql_select_db ("library-test") or die(mysql_error()); $ids = mysql_query("SELECT * FROM `ifc_studylog`") or die(mysql_error()); //not single quotes (tilda apostrophy) $x=0; $n=0; while($row = mysql_fetch_array( $ids )) { $tracksid1[$x] = $row['fitID']; $checkin[$x] = $row['checkin']; $checkout[$x] = $row['checkout']; $n++; $x++; } $names = mysql_query("SELECT * FROM `ifc_users`") or die(mysql_error()); //not single quotes (tilda apostrophy) $x=0; while($row = mysql_fetch_array( $names )) { $tracksnamefirst[$x] = $row['firstName']; $tracksnamesecond[$x] = $row['lastname']; $tracksid2[$x] = $row['fitID']; $tracksuser[$x] = $row['tracks']; $x++; } $x=0; foreach($tracksid2 as $comparename) { $chk = strval($x); ?> <script type='text/javascript' src='http://code.jquery.com/jquery-1.4.2.js'></script> <script type='text/javascript'> $(window).load(function () { $('.varx').click(function () { $('.text').toggle(this.checked); });}); </script> <?php echo '<td><input id = "<?=$chk?>" type="checkbox" class="varx" /></td>'; echo '<td align="center">'.$comparename.'</td>'; echo'<td align="center">'.$tracksnamefirst[$x].'</td>'; echo'<td align="center">'.$tracksnamesecond[$x].'</td>'; $z=0; foreach($tracksid1 as $compareid) { $HH=0; $MM =0; $SS =0; if($compareid == $comparename)// && $tracks==$tracksuser[$x]) { $SS = sprintf("%02s",(($checkout[$z]-$checkin[$z])%60)); $MM = sprintf("%02s",(($checkout[$z]-$checkin[$z])/60 %60)); $HH = sprintf("%02s",(($checkout[$z]-$checkin[$z])/3600 %24)); // echo'<td align="center">'.$HH.':'.$MM.':'.$SS.'</td>'; echo '</tr>'; echo '<tr>'; echo "<td id='txt' class='text' align='center' colspan='2' style='display:none'></td>"; echo "<td id='txt' class='text' align='center' style='display:none'>".$checkin[$z]."</td>"; echo '</tr>'; } echo '<tr>'; $z++; echo '</tr>'; } $x++; } } ?> Any Help is appreciated and sorry if I am too vague on the subject. The username and password is left off for security purposes.

Read the article
How can we get unique elements from any ORDER BY DECREASING OR INCREASING

- by Mohit

Code given below is taken from the stackoverflow.com !!! Can anyone tell me how to get the array elements order by decreaseing or increasing !! plz help me !!! Thanks in advance $contents = file_get_contents($htmlurl); // Get rid of style, script etc $search = array('@<script[^>]*?>.*?</script>@si', // Strip out javascript '@<head>.*?</head>@siU', // Lose the head section '@<style[^>]*?>.*?</style>@siU', // Strip style tags properly '@<![\s\S]*?--[ \t\n\r]*>@' // Strip multi-line comments including CDATA ); $contents = preg_replace($search, '', $contents); $result = array_count_values( str_word_count( strip_tags($contents), 1 ) ); print_r($result);

Read the article
jQuery all inputs with unique id

- by d3020

I have some textboxes on a webform that have ids like this: txtFinalDeadline_1 txtFinalDeadline_2 txtFinalDeadline_3 txtFinalDeadline_4 In my jQuery how do I find all of those in order to assign a value to them. Before I had the underscore and they were all named txtFinalDeadline I could do this and it worked. $(this).find("#txtFinalDeadline").val(formatDate); However, that was when they were all named the same thing. Now I have the _x after the name and I'm not sure how to go about assigning that same value as before to them. Thanks.

Read the article
Hibernate, select by id or unique column

- by Nican

I am using hibernate for Java, and I want to be able to select users by id or by name from the database. Nice thing about Hibernate is that it caches the result by id, but I seem to be unable to make it cache by name. static Session openSession = Factory.openSession(); public static User GetUser(int Id) { return (User) openSession.get(User.class, new Integer(Id)); } public static User GetUser( String Name ){ return (User) openSession.createCriteria( User.class ). add( Restrictions.eq("username", Name) ). uniqueResult(); } If I use GetUser(1) many times, hibernate will only show that it executed the first time. But every time I use GetUser("user1"), hibernate shows that it is executing a new query to database. What would be the best way to have the string identifier be cached also?

Read the article
Get a unique data in a SQL query

- by Jensen

Hi, I've a database who contain some datas in that form: icon(name, size, tag) (myicon.png, 16, 'twitter') (myicon.png, 32, 'twitter') (myicon.png, 128, 'twitter') (myicon.png, 256, 'twitter') (anothericon.png, 32, 'facebook') (anothericon.png, 128, 'facebook') (anothericon.png, 256, 'facebook') So as you see it, the name field is not uniq I can have multiple icons with the same name and they are separated with the size field. Now in PHP I have a query that get ONE icon set, for example : $dbQueryIcons = mysql_query("SELECT * FROM pl_icon WHERE tag LIKE '%".$SEARCH_QUERY."%' GROUP BY name ORDER BY id DESC LIMIT ".$firstEntry.", ".$CONFIG['icon_per_page']."") or die(mysql_error()); With this example if $tag contain 'twitter' it will show ONLY the first SQL data entry with the tag 'twitter', so it will be : (myicon.png, 16, 'twitter') This is what I want, but I would prefer the '128' size by default. Is this possible to tell SQL to send me only the 128 size when existing and if not another size ? In an another question someone give me a solution with the GROUP BY but in this case that don't run because we have a GROUP BY name. And if I delete the GROUP BY, it show me all size of the same icons. Thanks !

Read the article
Varchar with trailing spaces as a Primary Key in SQL Server 2008

- by cacaupt

Is it possible to have a varchar column as a primary key with values like 'a ' and 'a', is gives always this error "Violation of PRIMARY KEY constraint" in MS SQL Server 2008. In Oracle dons't give any error. BTW I'm not implementing this way I'm only trying to migrate the data from oracle to sql server. Regards

Read the article
Beware Sneaky Reads with Unique Indexes

- by Paul White NZ

A few days ago, Sandra Mueller (twitter | blog) asked a question using twitter’s #sqlhelp hash tag: “Might SQL Server retrieve (out-of-row) LOB data from a table, even if the column isn’t referenced in the query?” Leaving aside trivial cases (like selecting a computed column that does reference the LOB data), one might be tempted to say that no, SQL Server does not read data you haven’t asked for. In general, that’s quite correct; however there are cases where SQL Server might sneakily retrieve a LOB column… Example Table Here’s a T-SQL script to create that table and populate it with 1,000 rows: CREATE TABLE dbo.LOBtest ( pk INTEGER IDENTITY NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540); WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( some_value, lob_data ) SELECT TOP (1000) N.n, @Data FROM Numbers N WHERE N.n <= 1000; Test 1: A Simple Update Let’s run a query to subtract one from every value in the some_value column: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; As you might expect, modifying this integer column in 1,000 rows doesn’t take very long, or use many resources. The STATITICS IO and TIME output shows a total of 9 logical reads, and 25ms elapsed time. The query plan is also very simple: Looking at the Clustered Index Scan, we can see that SQL Server only retrieves the pk and some_value columns during the scan: The pk column is needed by the Clustered Index Update operator to uniquely identify the row that is being changed. The some_value column is used by the Compute Scalar to calculate the new value. (In case you are wondering what the Top operator is for, it is used to enforce SET ROWCOUNT). Test 2: Simple Update with an Index Now let’s create a nonclustered index keyed on the some_value column, with lob_data as an included column: CREATE NONCLUSTERED INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); This is not a useful index for our simple update query; imagine that someone else created it for a different purpose. Let’s run our update query again: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; We find that it now requires 4,014 logical reads and the elapsed query time has increased to around 100ms. The extra logical reads (4 per row) are an expected consequence of maintaining the nonclustered index. The query plan is very similar to before (click to enlarge): The Clustered Index Update operator picks up the extra work of maintaining the nonclustered index. The new Compute Scalar operators detect whether the value in the some_value column has actually been changed by the update. SQL Server may be able to skip maintaining the nonclustered index if the value hasn’t changed (see my previous post on non-updating updates for details). Our simple query does change the value of some_data in every row, so this optimization doesn’t add any value in this specific case. The output list of columns from the Clustered Index Scan hasn’t changed from the one shown previously: SQL Server still just reads the pk and some_data columns. Cool. Overall then, adding the nonclustered index hasn’t had any startling effects, and the LOB column data still isn’t being read from the table. Let’s see what happens if we make the nonclustered index unique. Test 3: Simple Update with a Unique Index Here’s the script to create a new unique index, and drop the old one: CREATE UNIQUE NONCLUSTERED INDEX [UQ dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); GO DROP INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest; Remember that SQL Server only enforces uniqueness on index keys (the some_data column). The lob_data column is simply stored at the leaf-level of the non-clustered index. With that in mind, we might expect this change to make very little difference. Let’s see: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; Whoa! Now look at the elapsed time and logical reads: Scan count 1, logical reads 2016, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992. CPU time = 172 ms, elapsed time = 16172 ms. Even with all the data and index pages in memory, the query took over 16 seconds to update just 1,000 rows, performing over 52,000 LOB logical reads (nearly 16,000 of those using read-ahead). Why on earth is SQL Server reading LOB data in a query that only updates a single integer column? The Query Plan The query plan for test 3 looks a bit more complex than before: In fact, the bottom level is exactly the same as we saw with the non-unique index. The top level has heaps of new stuff though, which I’ll come to in a moment. You might be expecting to find that the Clustered Index Scan is now reading the lob_data column (for some reason). After all, we need to explain where all the LOB logical reads are coming from. Sadly, when we look at the properties of the Clustered Index Scan, we see exactly the same as before: SQL Server is still only reading the pk and some_value columns – so what’s doing the LOB reads? Updates that Sneakily Read Data We have to go as far as the Clustered Index Update operator before we see LOB data in the output list: [Expr1020] is a bit flag added by an earlier Compute Scalar. It is set true if the some_value column has not been changed (part of the non-updating updates optimization I mentioned earlier). The Clustered Index Update operator adds two new columns: the lob_data column, and some_value_OLD. The some_value_OLD column, as the name suggests, is the pre-update value of the some_value column. At this point, the clustered index has already been updated with the new value, but we haven’t touched the nonclustered index yet. An interesting observation here is that the Clustered Index Update operator can read a column into the data flow as part of its update operation. SQL Server could have read the LOB data as part of the initial Clustered Index Scan, but that would mean carrying the data through all the operations that occur prior to the Clustered Index Update. The server knows it will have to go back to the clustered index row to update it, so it delays reading the LOB data until then. Sneaky! Why the LOB Data Is Needed This is all very interesting (I hope), but why is SQL Server reading the LOB data? For that matter, why does it need to pass the pre-update value of the some_value column out of the Clustered Index Update? The answer relates to the top row of the query plan for test 3. I’ll reproduce it here for convenience: Notice that this is a wide (per-index) update plan. SQL Server used a narrow (per-row) update plan in test 2, where the Clustered Index Update took care of maintaining the nonclustered index too. I’ll talk more about this difference shortly. The Split/Sort/Collapse combination is an optimization, which aims to make per-index update plans more efficient. It does this by breaking each update into a delete/insert pair, reordering the operations, removing any redundant operations, and finally applying the net effect of all the changes to the nonclustered index. Imagine we had a unique index which currently holds three rows with the values 1, 2, and 3. If we run a query that adds 1 to each row value, we would end up with values 2, 3, and 4. The net effect of all the changes is the same as if we simply deleted the value 1, and added a new value 4. By applying net changes, SQL Server can also avoid false unique-key violations. If we tried to immediately update the value 1 to a 2, it would conflict with the existing value 2 (which would soon be updated to 3 of course) and the query would fail. You might argue that SQL Server could avoid the uniqueness violation by starting with the highest value (3) and working down. That’s fine, but it’s not possible to generalize this logic to work with every possible update query. SQL Server has to use a wide update plan if it sees any risk of false uniqueness violations. It’s worth noting that the logic SQL Server uses to detect whether these violations are possible has definite limits. As a result, you will often receive a wide update plan, even when you can see that no violations are possible. Another benefit of this optimization is that it includes a sort on the index key as part of its work. Processing the index changes in index key order promotes sequential I/O against the nonclustered index. A side-effect of all this is that the net changes might include one or more inserts. In order to insert a new row in the index, SQL Server obviously needs all the columns – the key column and the included LOB column. This is the reason SQL Server reads the LOB data as part of the Clustered Index Update. In addition, the some_value_OLD column is required by the Split operator (it turns updates into delete/insert pairs). In order to generate the correct index key delete operation, it needs the old key value. The irony is that in this case the Split/Sort/Collapse optimization is anything but. Reading all that LOB data is extremely expensive, so it is sad that the current version of SQL Server has no way to avoid it. Finally, for completeness, I should mention that the Filter operator is there to filter out the non-updating updates. Beating the Set-Based Update with a Cursor One situation where SQL Server can see that false unique-key violations aren’t possible is where it can guarantee that only one row is being updated. Armed with this knowledge, we can write a cursor (or the WHILE-loop equivalent) that updates one row at a time, and so avoids reading the LOB data: SET NOCOUNT ON; SET STATISTICS XML, IO, TIME OFF; DECLARE @PK INTEGER, @StartTime DATETIME; SET @StartTime = GETUTCDATE(); DECLARE curUpdate CURSOR LOCAL FORWARD_ONLY KEYSET SCROLL_LOCKS FOR SELECT L.pk FROM LOBtest L ORDER BY L.pk ASC; OPEN curUpdate; WHILE (1 = 1) BEGIN FETCH NEXT FROM curUpdate INTO @PK; IF @@FETCH_STATUS = -1 BREAK; IF @@FETCH_STATUS = -2 CONTINUE; UPDATE dbo.LOBtest SET some_value = some_value - 1 WHERE CURRENT OF curUpdate; END; CLOSE curUpdate; DEALLOCATE curUpdate; SELECT DATEDIFF(MILLISECOND, @StartTime, GETUTCDATE()); That completes the update in 1280 milliseconds (remember test 3 took over 16 seconds!) I used the WHERE CURRENT OF syntax there and a KEYSET cursor, just for the fun of it. One could just as well use a WHERE clause that specified the primary key value instead. Clustered Indexes A clustered index is the ultimate index with included columns: all non-key columns are included columns in a clustered index. Let’s re-create the test table and data with an updatable primary key, and without any non-clustered indexes: IF OBJECT_ID(N'dbo.LOBtest', N'U') IS NOT NULL DROP TABLE dbo.LOBtest; GO CREATE TABLE dbo.LOBtest ( pk INTEGER NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540); WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( pk, some_value, lob_data ) SELECT TOP (1000) N.n, N.n, @Data FROM Numbers N WHERE N.n <= 1000; Now here’s a query to modify the cluster keys: UPDATE dbo.LOBtest SET pk = pk + 1; The query plan is: As you can see, the Split/Sort/Collapse optimization is present, and we also gain an Eager Table Spool, for Halloween protection. In addition, SQL Server now has no choice but to read the LOB data in the Clustered Index Scan: The performance is not great, as you might expect (even though there is no non-clustered index to maintain): Table 'LOBtest'. Scan count 1, logical reads 2011, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992. Table 'Worktable'. Scan count 1, logical reads 2040, physical reads 0, read-ahead reads 0, lob logical reads 34000, lob physical reads 0, lob read-ahead reads 8000. SQL Server Execution Times: CPU time = 483 ms, elapsed time = 17884 ms. Notice how the LOB data is read twice: once from the Clustered Index Scan, and again from the work table in tempdb used by the Eager Spool. If you try the same test with a non-unique clustered index (rather than a primary key), you’ll get a much more efficient plan that just passes the cluster key (including uniqueifier) around (no LOB data or other non-key columns): A unique non-clustered index (on a heap) works well too: Both those queries complete in a few tens of milliseconds, with no LOB reads, and just a few thousand logical reads. (In fact the heap is rather more efficient). There are lots more fun combinations to try that I don’t have space for here. Final Thoughts The behaviour shown in this post is not limited to LOB data by any means. If the conditions are met, any unique index that has included columns can produce similar behaviour – something to bear in mind when adding large INCLUDE columns to achieve covering queries, perhaps. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article
SQL SERVER – Validating Unique Columnname Across Whole Database

- by pinaldave

I sometimes come across very strange requirements and often I do not receive a proper explanation of the same. Here is the one of those examples. Asker: “Our business requirement is when we add new column we want it unique across current database.” Pinal: “Why do you have such requirement?” Asker: “Do you know the solution?” Pinal: “Sure I can come up with the answer but it will help me to come up with an optimal answer if I know the business need.” Asker: “Thanks – what will be the answer in that case.” Pinal: “Honestly I am just curious about the reason why you need your column name to be unique across database.” (Silence) Pinal: “Alright – here is the answer – I guess you do not want to tell me reason.” Option 1: Check if Column Exists in Current Database IF EXISTS ( SELECT * FROM sys.columns WHERE Name = N'NameofColumn') BEGIN SELECT 'Column Exists' -- add other logic END ELSE BEGIN SELECT 'Column Does NOT Exists' -- add other logic END Option 2: Check if Column Exists in Current Database in Specific Table IF EXISTS ( SELECT * FROM sys.columns WHERE Name = N'NameofColumn' AND OBJECT_ID = OBJECT_ID(N'tableName')) BEGIN SELECT 'Column Exists' -- add other logic END ELSE BEGIN SELECT 'Column Does NOT Exists' -- add other logic END I guess user did not want to share the reason why he had a unique requirement of having column name unique across databases. Here is my question back to you – have you faced a similar situation ever where you needed unique column name across a database. If not, can you guess what could be the reason for this kind of requirement? Additional Reference: SQL SERVER – Query to Find Column From All Tables of Database Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL System Table, SQL Tips and Tricks, T SQL, Technology

Read the article
Google Analytics: Do unique events report as unique visits when triggered on pages other than your own domain?

- by Jesse Gardner

We just recently attached a SWF to our Brightcove video player to report various events back to Google Analytics. We're also tracking page views with a standard GA snippet on the page where the player is embedded. As I understand it, because a unique has already been recorded for the page, any event being triggered by the player is getting associated with that unique. However, we allow people to embed the video player on other websites. All of the event data started pouring into the Events section as expected, but we noticed a dramatic uptick in unique visitors on the site (nearly double) while the pageview count stayed relatively unchanged. Disabling event tracking brought the traffic back down to average levels. I should also add that in the Pages section of Event tracking we're seeing URLs for other sites where the player has been embedded; but this data isn't showing up in the Content section. It seems counterintuitive, but does GA count an event fired as a unique visit even if it's triggered from some place other than your website? Is so, there any way to trigger an event in the events section without it reporting to the unique visitor count?

Read the article
Prolog Beginner: How to make unique values for each Variable in a predicate.

- by sixtyfootersdude

I have a prolog predicate: DoStuff( [A|B] ) :- <Stuff that I do> ... </Stuff that I do> It is all done except it needs to do return unique values. Ie if you do: ?- DoStuff(A,B,C,D). it should return: A=1; B=2; C=3; D=4. (Or something similar, the key point is that all of the values are unique). However you should be able to do this too: ?- DoStuff(A,A,B,B). And still get a valid answer. Ie: A=1; B=2. How can I do this? What I was planning on doing was something like this: DoStuff( [A|B] ) :- <Stuff that I do> ... </Stuff that I do> unique([A|B]). unique([]). unique([A|B]) :- A is not B. However I think that will make DoStuff([A,A,B]) not work because not all values will be unique.

Read the article
Deterministic/Consistent Unique Masking

- by Dinesh Rajasekharan-Oracle

One of the key requirements while masking data in large databases or multi database environment is to consistently mask some columns, i.e. for a given input the output should always be the same. At the same time the masked output should not be predictable. Deterministic masking also eliminates the need to spend enormous amount of time spent in identifying data relationships, i.e. parent and child relationships among columns defined in the application tables. In this blog post I will explain different ways of consistently masking the data across databases using Oracle Data Masking and Subsetting The readers of post should have minimal knowledge on Oracle Enterprise Manager 12c, Application Data Modeling, Data Masking concepts. For more information on these concepts, please refer to Oracle Data Masking and Subsetting document Oracle Data Masking and Subsetting 12c provides four methods using which users can consistently yet irreversibly mask their inputs. 1. Substitute 2. SQL Expression 3. Encrypt 4. User Defined Function SUBSTITUTE The substitute masking format replaces the original value with a value from a pre-created database table. As the method uses a hash based algorithm in the back end the mappings are consistent. For example consider DEPARTMENT_ID in EMPLOYEES table is replaced with FAKE_DEPARTMENT_ID from FAKE_TABLE. The substitute masking transformation that all occurrences of DEPARTMENT_ID say ‘101’ will be replaced with ‘502’ provided same substitution table and column is used , i.e. FAKE_TABLE.FAKE_DEPARTMENT_ID. The following screen shot shows the usage of the Substitute masking format with in a masking definition: Note that the uniqueness of the masked value depends on the number of columns being used in the substitution table i.e. if the original table contains 50000 unique values, then for the masked output to be unique and deterministic the substitution column should also contain 50000 unique values without which only consistency is maintained but not uniqueness. SQL EXPRESSION SQL Expression replaces an existing value with the output of a specified SQL Expression. For example while masking an EMPLOYEES table the EMAIL_ID of an employee has to be in the format EMPLOYEE’s [email protected] while FIRST_NAME and LAST_NAME are the actual column names of the EMPLOYEES table then the corresponding SQL Expression will look like %FIRST_NAME%||’.’||%LAST_NAME%||’@COMPANY.COM’. The advantage of this technique is that if you are masking FIRST_NAME and LAST_NAME of the EMPLOYEES table than the corresponding EMAIL ID will be replaced accordingly by the masking scripts. One of the interesting aspect’s of a SQL Expressions is that you can use sub SQL expressions, which means that you can write a nested SQL and use it as SQL Expression to address a complex masking business use cases. SQL Expression can also be used to consistently replace value with hashed value using Oracle’s PL/SQL function ORA_HASH. The following SQL Expression will help in the previous example for replacing the DEPARTMENT_IDs with a hashed number ORA_HASH (%DEPARTMENT_ID%, 1000) The following screen shot shows the usage of encrypt masking format with in the masking definition: ORA_HASH takes three arguments: 1. Expression which can be of any data type except LONG, LOB, User Defined Type [nested table type is allowed]. In the above example I used the Original value as expression. 2. Number of hash buckets which can be number between 0 and 4294967295. The default value is 4294967295. You can also co-relate the number of hash buckets to a range of numbers. In the above example above the bucket value is specified as 1000, so the end result will be a hashed number in between 0 and 1000. 3. Seed, can be any number which decides the consistency, i.e. for a given seed value the output will always be same. The default seed is 0. In the above SQL Expression a seed in not specified, so it to 0. If you have to use a non default seed then the function will look like. ORA_HASH (%DEPARTMENT_ID%, 1000, 1234 The uniqueness depends on the input and the number of hash buckets used. However as ORA_HASH uses a 32 bit algorithm, considering birthday paradox or pigeonhole principle there is a 0.5 probability of collision after 232-1 unique values. ENCRYPT Encrypt masking format uses a blend of 3DES encryption algorithm, hashing, and regular expression to produce a deterministic and unique masked output. The format of the masked output corresponds to the specified regular expression. As this technique uses a key [string] to encrypt the data, the same string can be used to decrypt the data. The key also acts as seed to maintain consistent outputs for a given input. The following screen shot shows the usage of encrypt masking format with in the masking definition: Regular Expressions may look complex for the first time users but you will soon realize that it’s a simple language. There are many resources in internet, oracle documentation, oracle learning library, my oracle support on writing a Regular Expressions, out of all the following My Oracle Support document helped me to get started with Regular Expressions: Oracle SQL Support for Regular Expressions[Video](Doc ID 1369668.1) USER DEFINED FUNCTION [UDF] User Defined Function or UDF provides flexibility for the users to code their own masking logic in PL/SQL, which can be called from masking Defintion. The standard format of an UDF in Oracle Data Masking and Subsetting is: Function udf_func (rowid varchar2, column_name varchar2, original_value varchar2) returns varchar2; Where • rowid is the row identifier of the column that needs to be masked • column_name is the name of the column that needs to be masked • original_value is the column value that needs to be masked You can achieve deterministic masking by using Oracle’s built in hash functions like, ORA_HASH, DBMS_CRYPTO.MD4, DBMS_CRYPTO.MD5, DBMS_UTILITY. GET_HASH_VALUE.Please refers to the Oracle Database Documentation for more information on the Oracle Hash functions. For example the following masking UDF generate deterministic unique hexadecimal values for a given string input: CREATE OR REPLACE FUNCTION RD_DUX (rid varchar2, column_name varchar2, orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2 (26); no_of_characters number(2); BEGIN no_of_characters:=6; stext:=substr(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(text),1)),0,no_of_characters); RETURN stext; END; The uniqueness depends on the input and length of the string and number of bits used by hash algorithm. In the above function MD4 hash is used [denoted by argument 1 in the DBMS_CRYPTO.HASH function which is a 128 bit algorithm which produces 2^128-1 unique hashed values , however this is limited by the length of the input string which is 6, so only 6^6 unique values will be generated. Also do not forget about the birthday paradox/pigeonhole principle mentioned earlier in this post. An another example is to consistently replace characters or numbers preserving the length and special characters as shown below: CREATE OR REPLACE FUNCTION RD_DUS(rid varchar2,column_name varchar2,orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2(26); BEGIN DBMS_RANDOM.SEED(orig_val); stext:=TRANSLATE(orig_val,'ABCDEFGHILKLMNOPQRSTUVWXYZ',DBMS_RANDOM.STRING('U',26)); stext:=TRANSLATE(stext,'abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.STRING('L',26)); stext:=TRANSLATE(stext,'0123456789',to_char(DBMS_RANDOM.VALUE(1,9))); stext:=REPLACE(stext,'.','0'); RETURN stext; END; The following screen shot shows the usage of an UDF with in a masking definition: To summarize, Oracle Data Masking and Subsetting helps you to consistently mask data across databases using one or all of the methods described in this post. It saves the hassle of identifying the parent-child relationships defined in the application table. Happy Masking

Read the article
how to make the StringListProperty's value unique in google-app-engine.

- by zjm1126

can do this ????: class Thread(db.Model): members = db.StringListProperty(unique =True) thanks

Read the article
Tracking unique views for a site showing my advertisements [on hold]

- by user580950

I am in trouble. I placed and advertisement on a website in 2012. The website said they got 950,000 unique visits each month. Early in 2012 I advertised with them. The advertisement didn't worked out. I checked in 2-3 months time and I saw that the unique visitors on the site was 8,000 at that time. I immediately closed the account. I don't remember which site I used to check the unique visitors. The advertising company has filed a dispute against me. So is there any tool that can show me the 2012 stats for any website? I tried Google Trends but it doesn't show statistics.

Read the article
Tracking Unique site Views for 2012 - Not my website

- by user580950

I am in trouble. I placed and advt on a website in 2012 which said he has 950,000 unique visits each month so early in 2012 i advertised with them. The advertised didn't worked out so checked in 2-3 months time and i saw that the unique visitors on their site was 8,000 at that time.I immediately close the account I dont remember which site i was checking the unique visitors.That advt company has filed a dispute against me. So is there any tool that give me stats of 2012 of any website. i tried google trends but it doesnt show statistics .

Read the article
How to keep a generic process unique?

- by Steve Van Opstal

I'm currently working on a project that makes connection between different banks which send us information on which that project replies. A part of that project configures the different protocols that are used (not every bank uses the same protocol), this runs on a separate server. These processes all have unique id's which are stored in a database. But to save time and money on configurations and new processes, we want to make a generic protocol that banks can use. Because of PCI requirements we have to make a separate process for every bank we connect to. But the generic process has only 1 unique identifier and therefor we cannot keep them apart. Giving every copy of that process a different identifier is as I see it impossible because they run entirely separate. So how do I keep my generic process unique?

Read the article
Generating Unique Content - How to Do it For Your Website

Well, the first question is, why do you need to have unique content on your website? It's really all down to the search engines; they want original content which ends in unique results, thus making it a much better experience for the users and hopefully ensuring that they will return to the site again.

Read the article
counting unique values based on multiple columns

- by gooogalizer

I am working in google spreadsheets and I am trying to do some counting that takes into consideration cell values across multiple cells in each row. Here's my table: |AUTHOR| |ARTICLE| |VERSION| |PRE-SELECTED| ANDREW GOLF STREAM 1 X ANDREW GOLF STREAM 2 X ANDREW HURRICANES 1 JOHN CAPE COD 1 X JOHN GOLF STREAM 1 (Google doc here) Each person can submit multiple articles as well as multiple versions of the same article. Sometimes different people submit different articles that happen to be identically named (Andrew and John both submitted different articles called "Golf Stream"). Multiple versions written by the same person do not count as unique, but articles with the same title written by different people do count as unique. So, I am looking to find a formula that Counts the number of unique articles that have been submitted [4] (without having to manually create extra columns for doing CONCATS, if possible) It would also be great to find formulas that: Count the number of unique articles that have been pre-selected (marked "X" in "PRE-SELECTED" column) [2] Count the number of unique articles that have only 1 version [4] Count the number of unique articles that have more than 1 of their versions pre-selected 1 Thank you so much! Nikita

Read the article
Python - Why use anything other than uuid4() for unique strings?

- by orokusaki

I see quit a few implementations of unique string generation for things like uploaded image names, session IDs, et al, and many of them employ the usage of hashes like SHA1, or others. I'm not questioning the legitimacy of using custom methods like this, but rather just the reason. If I want a unique string, I just say this: >>> import uuid >>> uuid.uuid4() 07033084-5cfd-4812-90a4-e4d24ffb6e3d And I'm done with it. I wasn't very trusting before I read up on uuid, so I did this: >>> import uuid >>> s = set() >>> for i in range(5000000): # That's 5 million! >>> s.add(uuid.uuid4()) ... ... >>> len(s) 5000000 Not one repeater (I didn't expect one considering the odds are like 1.108e+50, but it's comforting to see it in action). You could even half the odds by just making your string by combining 2 uuid4()s. So, with that said, why do people spend time on random() and other stuff for unique strings, etc? Is there an important security issue or other regarding uuid?

Read the article
how to generate unique numbers less than 8 characters long.

- by loudiyimo

hi I want to generate unique id's everytime i call methode generateCustumerId(). The generated id must be 8 characters long or less than 8 characters. This requirement is necessary because I need to store it in a data file and schema is determined to be 8 characters long for this id. Option 1 works fine. Instead of option 1, I want to use UUID. The problem is that UUID generates an id which has to many characters. Does someone know how to generate a unique id which is less then 99999999? option 1 import java.util.HashSet; import java.util.Random; import java.util.Set; public class CustomerIdGenerator { private static Set<String> customerIds = new HashSet<String>(); private static Random random = new Random(); // XXX: replace with java.util.UUID public static String generateCustumerId() { String customerId = null; while (customerId == null || customerIds.contains(customerId)) { customerId = String.valueOf(random.nextInt(89999999) + 10000000); } customerIds.add(customerId); return customerId; } } option2 generates an unique id which is too long public static String generateCustumerId() { String ownerId = UUID.randomUUID().toString(); System.out.println("ownerId " + ownerId); return ownerId }

Read the article
How to map string keys to unique integer IDs?

- by Marek

I have some data that comes regularily as a dump from a data souce with a string natural key that is long (up to 60 characters) and not relevant to the end user. I am using this key in a url. This makes urls too long and user unfriendly. I would like to transform the string keys into integers with the following requirements: The source dataset will change over time. The ID should be: non negative integer unique and constant even if the set of input keys changes preferrably reversible back to key (not a strong requirement) The database is rebuilt from scratch every time so I can not remember the already assigned IDs and match the new data set to existing IDs and generate sequential IDs for the added keys. There are currently around 30000 distinct keys and the set is constantly growing. How to implement a function that will map string keys to integer IDs? What I have thought about: 1. Built-in string.GetHashCode: ID(key) = Math.Abs(key.GetHashCode()) is not guaranteed to be unique (not reversible) 1.1 "Re-hashing" the built-in GetHashCode until a unique ID is generated to prevent collisions. existing IDs may change if something colliding is added to the beginning of the input data set 2. a perfect hashing function I am not sure if this can generate constant IDs if the set of inputs changes (not reversible) 3. translate to base 36/64/?? does not shorten the long keys enough What are the other options?

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >