buckets - Page 3 - Developer IT

extendible hashing

- by Phenom

I need to make a program that shows the hash value of a given key, using extendible hashing. In extendible hashing, I know that the buckets split and directories change. So if I make my program, do I have to already know things like if the bucket it hashes to is filled, or do I not have to worry about those things and just compute a hash value based on the key?

Read the article

Squid w/ SquidGuard fails w/ "Too few redirector processes are running"

- by DKNUCKLES

I'm trying to implement a Squid proxy in a quick and easy fashion and I'm receiving some errors I have been unable to resolve. The box is a pre-made appliance, however it seems to fail on launch.The following is the cache.log file when I attempt to launch the squid service. 2012/11/18 22:14:29| Starting Squid Cache version 3.0.STABLE20-20091201 for i686 -pc-linux-gnu... 2012/11/18 22:14:29| Process ID 12647 2012/11/18 22:14:29| With 1024 file descriptors available 2012/11/18 22:14:29| Performing DNS Tests... 2012/11/18 22:14:29| Successful DNS name lookup tests... 2012/11/18 22:14:29| DNS Socket created at 0.0.0.0, port 40513, FD 8 2012/11/18 22:14:29| Adding nameserver 192.168.0.78 from /etc/resolv.conf 2012/11/18 22:14:29| Adding nameserver 8.8.8.8 from /etc/resolv.conf 2012/11/18 22:14:29| helperOpenServers: Starting 5/5 'bin' processes 2012/11/18 22:14:29| ipcCreate: /opt/squidguard/bin: (13) Permission denied 2012/11/18 22:14:29| ipcCreate: /opt/squidguard/bin: (13) Permission denied 2012/11/18 22:14:29| ipcCreate: /opt/squidguard/bin: (13) Permission denied 2012/11/18 22:14:29| ipcCreate: /opt/squidguard/bin: (13) Permission denied 2012/11/18 22:14:29| ipcCreate: /opt/squidguard/bin: (13) Permission denied 2012/11/18 22:14:29| helperOpenServers: Starting 5/5 'squid-auth.pl' processes 2012/11/18 22:14:29| User-Agent logging is disabled. 2012/11/18 22:14:29| Referer logging is disabled. 2012/11/18 22:14:29| Unlinkd pipe opened on FD 23 2012/11/18 22:14:29| Swap maxSize 10240000 + 8192 KB, estimated 788322 objects 2012/11/18 22:14:29| Target number of buckets: 39416 2012/11/18 22:14:29| Using 65536 Store buckets 2012/11/18 22:14:29| Max Mem size: 8192 KB 2012/11/18 22:14:29| Max Swap size: 10240000 KB 2012/11/18 22:14:29| Version 1 of swap file with LFS support detected... 2012/11/18 22:14:29| Rebuilding storage in /opt/squid3/var/cache (DIRTY) 2012/11/18 22:14:29| Using Least Load store dir selection 2012/11/18 22:14:29| Set Current Directory to /opt/squid3/var/cache 2012/11/18 22:14:29| Loaded Icons. 2012/11/18 22:14:29| Accepting HTTP connections at 10.0.0.6, port 3128, FD 25. 2012/11/18 22:14:29| Accepting ICP messages at 0.0.0.0, port 3130, FD 26. 2012/11/18 22:14:29| HTCP Disabled. 2012/11/18 22:14:29| Ready to serve requests. 2012/11/18 22:14:29| Done reading /opt/squid3/var/cache swaplog (0 entries) 2012/11/18 22:14:29| Finished rebuilding storage from disk. 2012/11/18 22:14:29| 0 Entries scanned 2012/11/18 22:14:29| 0 Invalid entries. 2012/11/18 22:14:29| 0 With invalid flags. 2012/11/18 22:14:29| 0 Objects loaded. 2012/11/18 22:14:29| 0 Objects expired. 2012/11/18 22:14:29| 0 Objects cancelled. 2012/11/18 22:14:29| 0 Duplicate URLs purged. 2012/11/18 22:14:29| 0 Swapfile clashes avoided. 2012/11/18 22:14:29| Took 0.02 seconds ( 0.00 objects/sec). 2012/11/18 22:14:29| Beginning Validation Procedure 2012/11/18 22:14:29| WARNING: redirector #1 (FD 9) exited 2012/11/18 22:14:29| WARNING: redirector #2 (FD 10) exited 2012/11/18 22:14:29| WARNING: redirector #3 (FD 11) exited 2012/11/18 22:14:29| WARNING: redirector #4 (FD 12) exited 2012/11/18 22:14:29| Too few redirector processes are running FATAL: The redirector helpers are crashing too rapidly, need help! Squid Cache (Version 3.0.STABLE20-20091201): Terminated abnormally. CPU Usage: 0.112 seconds = 0.032 user + 0.080 sys Maximum Resident Size: 0 KB Page faults with physical i/o: 0 Memory usage for squid via mallinfo(): total space in arena: 2944 KB Ordinary blocks: 2857 KB 6 blks Small blocks: 0 KB 0 blks Holding blocks: 1772 KB 8 blks Free Small blocks: 0 KB Free Ordinary blocks: 86 KB Total in use: 4629 KB 157% Total free: 86 KB 3% The "permission denied" area is where I have been focusing my attention with no luck. The following is what I've tried. Chmod'ing the /opt/squidguard/bin folder to 777 Changing the user that squidguard runs under to root / nobody / www-data / squid3 Tried changing ownership of the /opt/squidguard/bin folder to all names listed above after assigning that user to run with squid. Any help with this would be greatly appreciated.

Read the article

32bit domU on 64bit dom0

- by ModuleC

I'm using 64bit Centos Dom0 with: 2.6.18-164.15.1.el5xen #1 SMP Wed Mar 17 12:04:23 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux recently i migrated some 32bit Centos domUs on this node. As by specs 32bit domUs should work with 64bit dom0. DomUs are paravirtualized, and everything works, except iptables limit. Anyways runing csf on domU will return following messages in dmesg: ip_tables: (C) 2000-2006 Netfilter Core Team Netfilter messages via NETLINK v0.30. ip_conntrack version 2.4 (2080 buckets, 16640 max) - 304 bytes per conntrack ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 ip_tables: limit match: invalid size 40 != 28 Doing lsmod on both dom0 and domU is listing all iptables modules required as loaded. I've found this http://www.mail-archive.com/[email protected]/msg189433.html But didn't find anything for centos on this issue. Am I missing something?

Read the article

squid3 auth thru samba using ntlm to AD doesn't work

- by derty

some users here are spending to much time exploring the WWW. So big boss whats to get this under control. We use a squid3 just for some security reason and chace benefits. and now i'm trying to set up a new proxy on a different server (Debian 6) Permissions are defined in AC and the squid3 should get the auth thru samba/winbind by using the ntlm protocol. but i'll get all the time Access, denited. it only works by using LDAP but thats not the way i need it. here some log and confs squid access.log 1326878095.784 1 192.168.15.27 TCP_DENIED/407 4049 GET http://at.msn.com/? -NONE/- text/html 1326878095.791 1 192.168.15.27 TCP_DENIED/407 4294 GET http://at.msn.com/? - NONE/- text/html 1326878095.803 9 192.168.15.27 TCP_DENIED/403 4028 GET http://at.msn.com/? kavan NONE/- text/html 1326878095.848 0 192.168.15.27 TCP_DENIED/403 3881 GET http://www.squid-cache.org/Artwork/SN.png kavan NONE/- text/html 1326878100.279 0 192.168.15.27 TCP_DENIED/403 3735 GET http://www.google.at/ kavan NONE/- text/html 1326878100.296 0 192.168.15.27 TCP_DENIED/403 3870 GET http://www.squid-cache.org/Artwork/SN.png kavan NONE/- text/html 1326878155.700 0 192.168.15.27 TCP_DENIED/407 4072 GET http://ie9cvlist.ie.microsoft.com/IE9CompatViewList.xml - NONE/- text/html 1326878155.705 2 192.168.15.27 TCP_DENIED/407 4317 GET http://ie9cvlist.ie.microsoft.com/IE9CompatViewList.xml - NONE/- text/html 1326878155.709 3 192.168.15.27 TCP_DENIED/403 4026 GET http://ie9cvlist.ie.microsoft.com/IE9CompatViewList.xml kavan NONE/- text/html squid chace 2012/01/18 10:12:49| Creating Swap Directories 2012/01/18 10:12:49| Starting Squid Cache version 3.1.6 for x86_64-pc-linux-gnu... 2012/01/18 10:12:49| Process ID 17236 2012/01/18 10:12:49| With 65535 file descriptors available 2012/01/18 10:12:49| Initializing IP Cache... 2012/01/18 10:12:49| DNS Socket created at [::], FD 7 2012/01/18 10:12:49| DNS Socket created at 0.0.0.0, FD 8 2012/01/18 10:12:49| Adding nameserver 192.168.15.2 from /etc/resolv.conf 2012/01/18 10:12:49| Adding nameserver 192.168.15.19 from /etc/resolv.conf 2012/01/18 10:12:49| Adding nameserver 192.168.15.1 from /etc/resolv.conf 2012/01/18 10:12:49| Adding domain schoenbrunn.local from /etc/resolv.conf 2012/01/18 10:12:49| helperOpenServers: Starting 5/5 'squid_ldap_auth' processes 2012/01/18 10:12:49| helperOpenServers: Starting 10/10 'ntlm_auth' processes 2012/01/18 10:12:49| helperOpenServers: Starting 10/10 'squid_kerb_auth' processes 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| helperOpenServers: Starting 5/5 'squid_ldap_group' processes 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| squid_kerb_auth: INFO: Starting version 1.0.5 2012/01/18 10:12:49| Unlinkd pipe opened on FD 73 2012/01/18 10:12:49| Local cache digest enabled; rebuild/rewrite every 3600/3600 sec 2012/01/18 10:12:49| Store logging disabled 2012/01/18 10:12:49| Swap maxSize 0 + 262144 KB, estimated 20164 objects 2012/01/18 10:12:49| Target number of buckets: 1008 2012/01/18 10:12:49| Using 8192 Store buckets 2012/01/18 10:12:49| Max Mem size: 262144 KB 2012/01/18 10:12:49| Max Swap size: 0 KB 2012/01/18 10:12:49| Using Least Load store dir selection 2012/01/18 10:12:49| Set Current Directory to /var/spool/squid3 2012/01/18 10:12:49| Loaded Icons. 2012/01/18 10:12:49| Accepting HTTP connections at [::]:3128, FD 74. 2012/01/18 10:12:49| HTCP Disabled. 2012/01/18 10:12:49| Squid modules loaded: 0 2012/01/18 10:12:49| Adaptation support is off. 2012/01/18 10:12:49| Ready to serve requests. 2012/01/18 10:12:50| storeLateRelease: released 0 objects smb.conf # Domain Authntication Settings workgroup = <WORKGROUP> security = ads password server = <DOMAINNAME>.LOCAL realm = <DOMAINNAME>.LOCAL ldap ssl = no # logging log level = 5 max log size = 50 # logs split per machine log file = /var/log/samba/%m.log # max 50KB per log file, then rotate ; max log size = 50 # User settings username map = /etc/samba/smbusers idmap uid = 10000-20000000 idmap gid = 10000-20000000 idmap backend = ad ; template primary group = <ad group> template shell = /sbin/nologin # Winbind Settings winbind separator = + winbind enum users = Yes winbind enum groups = Yes winbind netsted groups = Yes winbind nested groups = Yes winbind cache time = 10 winbind use default domain = Yes #Other Globals unix charset = LOCALE server string = <SERVERNAME> load printers = no printing = cups cups options = raw ; printcap name = /etc/printcap #obtain list of printers automatically on SystemV ; printcap name = lpstat ; printing = cups squid.conf auth_param ntlm program /usr/bin/ntlm_auth --require-membership-of=<DOMAINNAME>\\INTERNETZ --helper-protocol=squid-2.5-ntlmssp auth_param ntlm children 10 auth_param basic program /usr/lib/squid3/squid_ldap_auth -R -b "dc=<dcname>,dc=local" -D "cn=administrator,cn=Users,dc=<domainname>,dc=local" -w "******" -f sAMAccountName=%s -h 192.168.15.19:3268 auth_param basic realm "Proxy Authentifizierung. Bitte geben Sie Ihren Benutzername und Ihr Passwort ein!" #means insert you PW in an other language - # external_acl_type InetGroup %LOGIN /usr/lib/squid3/squid_ldap_group -R -b "dc=<domainname>,dc=local" -D "cn=administrator,cn=Users,dc=<domainname>,dc=local" -w "******" -f "(&(objectclass=person)(sAMAccountName=%v) (memberof=cn=%a,cn=internetz,dc=<domainname>,dc=local))" -h 192.168.15.19:3268 auth_param negotiate program /usr/lib/squid3/squid_kerb_auth -d auth_param negotiate children 10 auth_param negotiate keep_alive on acl localnet proxy_auth REQUIRED acl InetAccess external InetGroup Internetz http_access allow InetAccess http_access deny all acl auth proxy_auth REQUIRED http_access allow auth and a very suspicious is that by adding the proxy server to the Domain i see 2 new entries in the PC one with the original computer-name leopoldine and one with leopoldine CNF:f8efa4c4-ff0e-4217-939d-f1523b43464d ?!? I tried a lot, really... but i stuck on this problem... i actually i even reinstalled all dependent programs and reconfigured them from default. Group exists and has me in it. Firefox running on the old proxy and i use IE for testing the new one. But i'll get all the time Access-Denited and to be honest i'm quite a beginner, so please don't be to prude. I'll interested in improving, i'll get the information we need to fix this but i started working 2 month ago and got only 1 1/2 year's training and not a single sec. in linux ;)

Read the article

Best Amazon S3 File Manager Utility?

- by mmacaulay

Ever since I started using Amazon's S3 service I've been struggling to find a good solution for simple file management, without having to write my own app to browse my buckets, upload and delete files, etc. The best I've found so far to do this is S3Fox. But it's far from perfect, it has problems deleting files and folders. Comments on the Firefox plugin page indicate I'm not the only person with this problem and the developer does not respond to emails. I've looked around briefly, but couldn't find anything that looked any better than S3Fox. Please tell me there's a better way! Edit (07/26/2009): The Firefox S3Fox extension seems to be getting love from its developer again, the problems I was having before have gone away, and I'm using it on a regular basis now with no problems!

Read the article

A decent S3 bucket manager for Ubuntu

- by Luke

I'm looking for a decent S3 bucket manager for Ubuntu (Gnome). I prefer it to integrate with Nautilus so it will look like just any other drive (a la WebDAV) but so far I haven't been able to find anything that I'd like to use on a daily basis. What bucket managers do you use for Ubuntu or what bucket manager would you recommend? UPDATE: S3FS seems to be what I'd really want to use since it lets me integrate my buckets directly into my file-system. However, when trying S3FS I do not get the impression that it's ready for prime time. I'm stunned by the fact that there are no decent bucket managers out there for Ubuntu/Gnome, guess I have to build it myself...

Read the article

Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article

How to start a s3ql script automatically on boot?

- by ks78

I've been experimenting with s3ql on Ubuntu 10.04, using it to mount Amazon S3 buckets. However, I'd really like it to mount them automatically. Does anyone know how to do that? I've been working on a script, which works when its run from from the commandline, but for some reason I can't get it to run automatically on boot. Does anyone have any ideas? Here's my script: #! /bin/sh # /etc/init.d/s3ql # ### BEGIN INIT INFO # Provides: s3ql # Required-Start: $remote_fs $syslog # Required-Stop: $remote_fs $syslog # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Start daemon at boot time # Description: Enable service provided by daemon. ### END INIT INFO case "$1" in start) # Redirect stdout and stderr into the system log DIR=$(mktemp -d) mkfifo "$DIR/LOG_FIFO" logger -t s3ql -p local0.info < "$DIR/LOG_FIFO" & exec > "$DIR/LOG_FIFO" exec 2>&1 rm -rf "$DIR" modprobe fuse fsck.s3ql --batch s3://mybucket exec mount.s3ql --allow-other s3://mybucket /mnt/s3fs ;; stop) umount.s3ql /mnt/s3fs ;; *) echo "Usage: /etc/init.d/s3ql{start|stop}" exit 1 ;; esac exit 0

Read the article

Distribute Sort Sample Service

- by kaleidoscope

How it works? Using the front-end of the service, a user can specify a size in MB for the input data set to sort. Algorithm CreateAndSplit The CreateAndSplit task generates the input data and stores them as 10 blobs in the utility storage. The URLs to these blobs are packaged as Separate work items and written to the queue. · Separate The Separate task reads the blobs with the random numbers created in the CreateAndSplit task and places the random numbers into buckets. The interval of the numbers that go into one bucket is chosen so that the expected amount of numbers (assuming a uniform distribution of the numbers in the original data set) is around 100 kB. Each bucket is represented as a blob container in utility storage. Whenever there are 10 blobs in one bucket (i.e., the placement in this bucket is complete because we had 10 original splits), the separate task will generate a new Sort task and write the task into the queue. · Sort The Sort task merges all blobs in a single bucket and sorts them using a standard sort algorithm. The result is stored as a blob in utility storage. · Concat The concat task merges the results of all Sort tasks into a single blob. This blob can be downloaded as a text file using this Web page. As the resulting file is presented in text format, the size of the file is likely to be larger than the specified input file. Anish

Read the article

How does Google Analytics aggregate the Count of Visits (Frequency & Recency Report)?

- by Brian Dant

Here's my simple understanding of Count of Visits: Each person that comes to my site gets one "count" for each visit. They are put into a bucket of people with the same number of total counts -- if you visit twice, you are in the two bucket, if you visit six times, you are in the six bucket. From there, a report (Frequency & Recency) makes a line for each bucket and reaches into the bucket and totals the number of people in that bucket, putting that total in the second column. My Question: Will a two month report automatically put someone into two buckets, and put them on two separate lines in the Count of Visits table? This explaination makes it seem like a two-month long report will put the same person into a bucket twice, one bucket for each month. The two-month report will then show that person's visits on two different lines, instead of aggregating them. Example for Clarification: Bob comes to my site three times in January and seven times in February. I run a report for Jan 1 -- Feb 28. Will Bob be on both the Three Count line and the Seven Count line, or will he be on the Ten Count line?

Read the article

Hash Algorithm Randomness Visualization

- by clstroud

I'm curious if anyone here has any idea how the images were generated as shown in this response: Which hashing algorithm is best for uniqueness and speed? Ian posted a very well-received response but I can't seem to understand how he went about making the images. I hate to make a new question dedicated to this, but I can't find any means to ask him more directly. On the other hand, perhaps someone has an alternative perspective. The best I can personally come up with would be to have it almost like a bar graph, which would illustrate how evenly the buckets of the hash table are being generated. I have a working Cocoa program that does this, but it can't generate anything like what he showed there. So the question is two fold I suppose: A) How does one truly interpret the data he shows? Is it more than "less whitespace = better"? B) How does one generate such an image based on some set of inputs, a hash, and an index? Perhaps I'm misunderstanding entirely, but I really would like to know more about this particular visualization technique. Or maybe I'm mis-applying this to hash tables rather than just hashes in general, but in that case I don't know how it would be "bounded" for the image.

Read the article

Data Structure Behind Amazon S3s Keys (Filtering Data Structure)

- by dimo414

I'd like to implement a data structure similar to the lookup functionality of Amazon S3. For those of you who don't know what I'm taking about, Amazon S3 stores all files at the root, but allows you to look up groups of files by common prefixes in their names, therefore replicating the power of a directory tree without the complexity of it. The catch is, both lookup and filter operations are O(1) (or close enough that even on very large buckets - S3's disk equivalents - both operations might as well be O(1))). So in short, I'm looking for a data structure that functions like a hash map, with the added benefit of efficient (at the very least not O(n)) filtering. The best I can come up with is extending HashMap so that it also contains a (sorted) list of contents, and doing a binary search for the range that matches the prefix, and returning that set. This seems slow to me, but I can't think of any other way to do it. Does anyone know either how Amazon does it, or a better way to implement this data structure?

Read the article

PHP/MySQL database connection priority?

- by Josh

Hello, I have a production database where usage statistics for a service we're working on. I use php to periodically roll up different resolutions (day, week, month, year) of interesting statistics in buckets dictated by the resolution. The php application I've written "completes" its data when its run, such that it will calculate all the rolled-up statistics for the resolutions and periods since it was last run. This is useful if we want to turn this off to debug database performance issues, because I can turn it back on and have it complete its data set. The problem I have, is the calculations are fairly intensive and drive the QPS of the production database server up. Is there a way to set a "priority" on a particular database connection so that it will only use "off-cycles" to do these calculations? Maybe a proper response would be to replicate the tables I'm working on into a different stats database, but, unfortunately I don't have the resources in place to attempt such a thing (yet). Thanks in advance for any help, Josh

Read the article

How does .NET determine the week in a TimeZoneInfo.TransitionTime ?

- by Travis Brooks

Greetings I'm trying to do some DateTime math for various time zones and I wanted to take daylight savings into account. Lets say I have a TimeZoneInfo and i've determined the appropriate AdjustmentRule for a given DateTime. Lets also say the particular TimeZoneInfo i'm dealing with is specified as rule.DaylightTransitionStart.IsFixedDateRule == false, so I need to figure out if the given DateTime falls within the start/end TransitionTime.Week values. This is where I'm getting confused, what is .NET considering as a "week"? My first thought was it probably used something like DayOfWeek thisMarksWeekBoundaries = Thread.CurrentThread.CurrentUICulture.DateTimeFormat.FirstDayOfWeek; and went through the calendar assigning days to week, incrementing week every time it crossed a boundary. But, if I do this for May 2010 there are 6 week boundary buckets, and the max valid value for TransitionTime.Week is 5 so this can't be right. Whats the right way to slice up May 2010?

Read the article

Amazon S3 security credentials per bucket

- by slythic

Hi all, I was wondering if it was possible to generate security credentials per individual Amazon S3 bucket. I am working with a developer and would like to grant him access only to the bucket we are working with. It's not a trust issue, it's more a concern that he'll delete the wrong bucket or its contents. For example: If we were working on an application that used a bucket called test-application I could generate the credentials for just that one bucket. These credentials would not allow access to other buckets in my account. Is this possible? Thanks, Tony

Read the article

Amazon S3 collisions with heroku and paperclip

- by poseid

I have an app on my localhost for development and an app for testing on heroku. Image upload with localhost and paperclip always works. However, doing the same experiment with image upload on my heroku app, the app hangs... and the upload seems to be going on forever. I suspect that there is a collision going on. What is needed to get but uploads working? Or do I need to use different buckets for each environment?

Read the article

Oracle (PL/SQL): Is UPDATE RETURNING concurrent?

- by Jaap

I'm using table with a counter to ensure unique id's on a child element. I know it is usually better to use a sequence, but I can't use it because I have a lot of counters (a customer can create a couple of buckets and each of them needs to have their own counter, they have to start with 1 (it's a requirement, my customer needs "human readable" keys). I'm creating records (let's call them items) that have a prikey (bucket_id, num = counter). I need to guarantee that the bucket_id / num combination is unique (so using a sequence as prikey won't fix my problem). The creation of rows doesn't happen in pl/sql, so I need to claim the number (btw: it's not against the requirements to have gaps). My solution was: UPDATE bucket SET counter = counter + 1 WHERE id = param_id RETURNING counter INTO num_forprikey; PL/SQL returns var_num_forprikey so the item record can be created. Question: Will I always get unique num_forprikey even if the user concurrently asks for new items in a bucket?

Read the article

How to securely stream video from amazon S3

- by JP.

I have couple of copyright videos available on my S3 buckets. I want to stream them on my website, but at the same time. I don't want the users to rip the video from the video player. I tried to google about it but still i am not confident on this, coz i do not know the intricacies of options available like Server Side encryption None/ AES-256 2) A very interesting option is under Metadata tab - It shows couple of keys & Values. How can i use them to secure my video content? 3) Add more meta data and related options?

Read the article

Counting Values in R Vector

- by GTyler

I have a large vector of percentages (0-100) and I am trying to count how many of them are in specific 20% buckets (<20, 20-40, 40-60,60-80,80-100). The vector has length 129605 and there are no NA values. Here's my code: x<-c(0,0,0,0,0) for(i in 1: length(mail_return)) { if (mail_return[i]<=20) { x[1] = x[1] + 1 } if (mail_return[i]>20 && mail_return[i]<=40) { x[2] = x[2] + 1 } if (mail_return[i]>40 && mail_return[i]<=60) { x[3] = x[3] + 1 } if (mail_return[i]>60 && mail_return[i]<=80) { x[4] = x[4] + 1 } else { x[5] = x[5] + 1 } } But sum(x) is giving me length 133171. Shouldn't it be the length of the vector, 129605? What's wrong?

Read the article

map subdomain to another subdomain via cname

- by Stephen

Question: I need to get DNS configured to point a subdomain from one domain (which I will generally not be controlling) to another subdomain on a different domain name. Testing this process using a simple CNAME entry keeps pointing to the primary domain and not the subdomain where it should be going. This is the scenario; (newdomain.com is in my control) cdn.xyz.com should display content from this subdomain subdomain.newdomain.com It is instead displaying content from newdomain.com (not the subdomain sub domain) cdn.xyz.com/page.htm displays content from newdomain.com/page.htm although what I need is it to display content from subdomain.newdomain.com/page.htm Other Background: setup is between two different servers with different IP ranges although DNS cluster is on between all servers the newdomain.com is set up with its own unique IP (which is on the A records for the subdomains, the subdomains work as expected/normal) the DNS entry is correct (cdn CNAME subdomain.newdomain.com.) ie the end period is included a DNS lookup on the CNAME externally reports back as subdomain.newdomain.com. as the record Does anyone know what DNS entries I am missing to get this working correctly ? Note: I do not want to just put a redirect between domains as I need the content of subdomain.newdomain.com/content.html to be visible via the URL of cdn.xyz.com/content.html also I can just use some redirects on newdomain.com to achieve what I am after but would prefer to just get the DNS correct. EDIT Current DNS cdn CNAME subdomain.newdomain.com. || CNAME entry for domain1 subdomain A XXX.XXX.XXX.XXX || A record entry for working subdomain pointing to unique IP What should happen is that cdn.domain1.com - subdomain.newdomain.com What is happening is cdn.domain1.com - newdomain.com (ie. the root not the subdomain) EDIT 2 Actually if its easier I am trying to emulate a simple cloud setup like Rackspace Containers (which I assume is similar to Buckets on AWS). although it is not for cloud storage Where a container has a url reference of hd62321678d323.rackspace.com (in truth they are much longer) so I can use a CNAME record of: cdn CNAME hd62321678d323.rackspace.com. so that http://cdn.mydomain.com/myfile.jpg displays content from http://hd62321678d323.rackspace.com/myfile.jpg

Read the article

FreeBSD Traffic Shaping

- by alexus

Hi I'm trying to do traffic shaping with FreeBSD, here are my rules su-3.2# ipfw show | grep pipe 08380 1514852 125523804 pipe 1 tcp from any to any dst-port 80 su-3.2# ipfw pipe 1 show 00001: 2.000 Mbit/s 0 ms 50 sl. 1 queues (1 buckets) droptail mask: 0x00 0x00000000/0x0000 - 0x00000000/0x0000 BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp 0 tcp 64.237.55.83/60598 72.21.81.133/80 6520267 1204533020 0 0 1216 su-3.2# first of all why when I run ipfw pipe 1 show i get same source and destination ip, that doesnt seem like ever change yet total packets/bytes increasing and most important question, after donig all that I'm looking at my MRTG stats and I see i'm very well over 2Mbit/s limit. what am I doing wrong? here is config file flush pipe flush pipe 1 config bw 2Mbit/s add 100 allow ip from any to any via lo0 add 200 deny ip from any to 127.0.0.0/8 add 300 deny ip from 127.0.0.0/8 to any add 8380 pipe 1 tcp from any to any src-port www uid daemon add 8380 pipe 1 tcp from any to any dst-port www uid daemon add 65000 pass all from any to any

Read the article

How to start a s3ql script automatically on boot?

- by ks78

I've been experimenting with s3ql on Ubuntu 10.04, using it to mount Amazon S3 buckets. However, I'd really like it to mount them automatically. Does anyone know how to do that? I've been working on a script, which works when its run from from the commandline, but for some reason I can't get it to run automatically on boot. Does anyone have any ideas? Here's my script: #! /bin/sh # /etc/init.d/s3ql # ### BEGIN INIT INFO # Provides: s3ql # Required-Start: $remote_fs $syslog # Required-Stop: $remote_fs $syslog # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Start daemon at boot time # Description: Enable service provided by daemon. ### END INIT INFO case "$1" in start) # Redirect stdout and stderr into the system log DIR=$(mktemp -d) mkfifo "$DIR/LOG_FIFO" logger -t s3ql -p local0.info < "$DIR/LOG_FIFO" & exec > "$DIR/LOG_FIFO" exec 2>&1 rm -rf "$DIR" modprobe fuse fsck.s3ql --batch s3://mybucket exec mount.s3ql --allow-other s3://mybucket /mnt/s3fs ;; stop) umount.s3ql /mnt/s3fs ;; *) echo "Usage: /etc/init.d/s3ql{start|stop}" exit 1 ;; esac exit 0

Read the article

The Latest News About SAP

- by jmorourke

Like many professionals, I get a lot of my news from Google e-mail alerts that I’ve set up to keep track of key industry trends and competitive news. In the past few weeks, I’ve been getting a number of news alerts about SAP. Below are a few recent examples: Warm weather cuts short US maple sugaring season – by Toby Talbot, AP MILWAUKEE – Temperatures in Wisconsin had already hit the high 60s when Gretchen Grape and her family began tapping their 850 maple trees. They had waited for the state's ceremonial tapping to kick off the maple sugaring season. It was moved up five days, but that didn't make much difference. For Grape, the typically month-long season ended nine days later. The SAP had stopped flowing in a record-setting heat wave, and the 5-quart collection bags that in a good year fill in a day were still half-empty. Instead of their usual 300 gallons of syrup, her family had about 40. Maple syrup producers across the North have had their season cut short by unusually warm weather. While those with expensive, modern vacuum systems say they've been able to suck a decent amount of sap from their trees, producers like Grape, who still rely on traditional taps and buckets, have seen their year ruined. "It's frustrating," said the 69-year-old retiree from Holcombe, Wis. "You put in the same amount of work, equipment, investment, and then all of a sudden, boom, you have no SAP." Home & Garden: Too-Early Spring Means Sugaring Woes - by Georgeanne Davis for The Free Press Over this past weekend, forsythia and daffodils were blooming in the southern parts of the state as temperatures climbed to 85 degrees, and trees began budding out, putting an end to this year's maple syrup production even as the state celebrated Maine Maple Sunday. Maple sugaring needs cold nights and warm days to induce SAP flows. Once the trees begin budding, SAP can still flow, but the SAP is bitter and has an off taste. Many farmers and dairymen count on sugaring for extra income, so the abbreviated season is a real financial loss for them, akin to the shortened shrimping season's effect on Maine lobstermen. SAP season comes to a sugary Sunday finale – Kennebec Journal, March 26th, 2012 Rebecca Manthey stood out in the rain at the entrance of Old Fort Western keeping watch over a cast iron kettle of boiling SAP hooked to a tripod over a wood fire. Manthey and the rest of the Old Fort Western staff -- decked out in 18th-century attire -- joined sugar houses across the state in observance of Maine Maple Sunday. The annual event is sponsored by the Department of Agriculture and the Maine Maple Producers Association. She said the rain hadn't kept people from coming to enjoy all the events at the fort surrounding the production of Maple syrup. "In the 18th century, you would be boiling SAP in the woods, so I would be in the woods," Manthey explained to the families who circled around her. "People spent weeks and weeks in the woods. You don't want to cook it to fast or it would burn. When it looks like the right consistency then you send it (into the kitchen) to be made into sugar." Manthey said she enjoyed portraying an 18th-century woman, even in the rain, which didn't seem to bother visitors either. There was a steady stream of families touring the fort and enjoying the maple syrup demonstrations. I hope you enjoy these updates on SAP – Happy April Fool’s Day!

Read the article

Working for free?

- by Jonny

I came across this article Work for Free that got me thinking. The goal of every employer is to gain more value from workers than the firm pays out in wages; otherwise, there is no growth, no advance, and no advantage for the employer. Conversely, the goal of every employee should be to contribute more to the firm than he or she receives in wages, and thereby provide a solid rationale for receiving raises and advancement in the firm. I don't need to tell you that the refusenik didn't last long in this job. In contrast, here is a story from last week. My phone rang. It was the employment division of a major university. The man on the phone was inquiring about the performance of a person who did some site work on Mises.org last year. I was able to tell him about a remarkable young man who swung into action during a crisis, and how he worked three 19-hour days, three days in a row, how he learned new software with diligence, how he kept his cool, how he navigated his way with grace and expertise amidst some 80 different third-party plug-ins and databases, how he saw his way around the inevitable problems, how he assumed responsibility for the results, and much more. What I didn't tell the interviewer was that this person did all this without asking for any payment. Did that fact influence my report on his performance? I'm not entirely sure, but the interviewer probably sensed in my voice my sense of awe toward what this person had done for the Mises Institute. The interviewer told me that he had written down 15 different questions to ask me but that I had answered them all already in the course of my monologue, and that he was thrilled to hear all these specifics. The person was offered the job. He had done a very wise thing; he had earned a devotee for life. The harder the economic times, the more employers need to know what they are getting when they hire someone. The job applications pour in by the buckets, all padded with degrees and made to look as impressive as possible. It's all just paper. What matters today is what a person can do for a firm. The resume becomes pro forma but not decisive under these conditions. But for a former boss or manager to rave about you to a potential employer? That's worth everything. What do you think? Has anyone here worked for free? If so, has it benefited you in any way? Why should(nt) you work for free (presuming you have the money from other means to keep you going)? Can you share your experience? Me, I am taking a year out of college and haven't gotten a degree yet so that's probably why most of my job applications are getting ignored. So im thinking about working for free for the experience?

Read the article

How to group strings by prefix

- by namenlos

I am writing a Winform UI in which the user must select a single customer. (For reasons beyond my control I am limited to a UI that uses dropdown lists, text fields, checkboxes, radiobuttons only -i.e. no fancy special UI controls) The situation There are a lot of customers (a thousand for example) If i put all the customers in a single dropdown there's no way it will be easy for a customer to even see all the customers. Also the it will take too long to retireve all the customers from the DB to populate the dropdown My thought is to have two combo box, the first lists groups of the customers by their last name something like a phone book "Aa-Ac", "Ad-Ade", "Adf-B", when selecting the first combo box, it scope the second one to a managable set customer names (no more than for example 40 names) The question I need a reasonable way of grouping their names such that it will be clear to customer which group contains the name. I.e. given a group of names I need to bucketize then int "Aa-Ac". Comments I don't need to solve the general problem of an immense number of names - we know based on our data that 1000 names is the max our users will encounter. If there are other techniques please do share, but I am interested specifically in an answer to my specific question around how to determine the buckets ("Aa-Ac", etc.)

Search Results

Search found 100 results on 4 pages for 'buckets'.

Page 3/4 | < Previous Page | 1 2 3 4 | Next Page >

- by Phenom

- by DKNUCKLES

- by ModuleC

- by derty

- by mmacaulay

- by Luke

- by Paul White NZ

- by ks78

- by kaleidoscope

- by Brian Dant

- by clstroud

- by dimo414

- by Josh

- by Travis Brooks

- by slythic

- by poseid

- by Jaap

- by JP.

- by GTyler

- by Stephen

- by alexus

- by ks78

- by jmorourke

- by Jonny

- by namenlos

< Previous Page | 1 2 3 4 | Next Page >