Search Results

Search found 14 results on 1 pages for 'voldemort'.

Page 1/1 | 1 

  • Correct usage of Voldemort as key-value pair?

    - by zengr
    Hello, I am trying to understand, how can Voldermort be used? Say, I have this scenario: Since, Voldemort is a key-value pair. I need to fetch a value (say some text) on the basis of 3 parameters. So, what will be the key in this case? I cannot use 3 keys for 1 value right, but that value should be search able on the basis of those 3 parameters. Am I making sense? Thanks EDIT1 eg: A blog system. A user posts a blog: User's data stored: Name, Age and Sex The blog content (text) is stored. Now, I need to use Voldemort here, if a user searches from the front end for all the blog posts by Sex: Male Then, my code should query voldemort and return all the "blog content (text)" which have Sex as Male.

    Read the article

  • How to create a voldemort store?

    - by Pankaj
    I am trying to understand the Voldemort java API. I am new to non relational databases, so as I understand, Voldemort's store can be compared to a table in relational model. I saw the following code in the documentation. String bootstrapUrl = "tcp://localhost:6666"; StoreClientFactory factory = new SocketStoreClientFactory(new ClientConfig().setBootstrapUrls(bootstrapUrl)); // create a client that executes operations on a single store StoreClient client = factory.getStoreClient("test"); Here, we got a Store client based on the an existing store (test). How can I actually create a store for Voldemort through java?

    Read the article

  • persist image or video into voldemort

    - by qkrsppopcmpt
    I am working on my web service, and required to persist some image (jpg whatever) and video(wmv) into memmory. Just want to use single_node_cluster to feel voldemort. Can anybody give me a hint of the configuration and sample code of voldemort? I mean how to configure the value type in stores.xml? protobuf? java-serialization? Any sample or link would be helpful. Thanks

    Read the article

  • Excluding some classes from the cobertura report doesn't work

    - by user357480
    I tried to exclude some classes from cobertura as specified in this site <cobertura-instrument todir="${voldemort.instrumented.dir}" datafile="${cobertura.instrument.file}"> <classpath refid="tools-classpath" /> <ignore regex=".*\.xsd" /> <fileset dir="${voldemort.dist.dir}/classes"> <include name="**/*.class" /> <exclude name="**/client/protocol/pb/*.class"/> <exclude name="**/server/http/*.class"/> </fileset> </cobertura-instrument> but that doesn't work help me out, pleaseeeeeeeeeeee

    Read the article

  • Is Berkeley DB a NoSQL solution?

    - by Gregory Burd
    Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

    Read the article

  • Oracle's Director NoSQL Database Product Management talks with ODBMS.ORG

    - by thegreeneman
    I was pinged by one of my favorite database technology sites today, ODBMS.ORG - informing that Dave Segleau, the Director of Oracle NoSQL Database product management spent some time talking with their editor Roberto Zicari about the product.   Its a great interview and I highly recommend the read.  I think its important to understand the connectivity that Oracle NoSQL Database (ONDB) has with BerkeleyDB, as it says a lot about the maturity of ONDB as it relates to data integrity and reliability.  BerkeleyDB has been living the NoSQL life since the beginning of this transition embracing the right tool for the job approach to data management.  Several of the biggest names in NoSQL ( e.g. LinkedIn's Voldemort ) built their NoSQL scale-out solutions leveraging the robust BerkeleyDB storage engine under their distribution architectures.  Oracle commercializing the same via ONDB makes perfect sense given the demonstrated need for this category of technology.

    Read the article

  • Announcing Berkeley DB Java Edition Major Release

    - by Eric Jensen
    Berkeley DB Java Edition 5.0 was just released. There are a number of new features, enhancements, and options in there that our users have been asking for. Chief among them is a new class called DiskOrderedCursor, which greatly increases performance of systems using spinning platter magnetic hard drives. A number of users expressed interest in this feature, including Alex Feinberg of LinkedIn. Berkeley DB Java Edition is part of Project Voldemort, a distributed key/value database used by LinkedIn. There have been many other improvements and optimizations. Concurrency is significantly improved, as is the performance of update and delete operations. New and interesting methods include Environment.preload, which allows multiple databases to be preloaded simultaneously. New Cursor methods enable for more effective searching through the database. We continue to enhance Berkeley DB Java Edition’s High Availability as well. One new feature is the ability to open a replicated node read-only when the master is unavailable. This can allow critical systems to continue offering some functionality, even during a network or master node failure. There’s a lot more in release 5.0. I encourage you to take a look at the extensive changelog yourself. As always, you can download the new release and try it out here: http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html

    Read the article

  • How common is it to submit papers to journals or conferences outside of academia?

    - by Furry
    I worked in academia a few years, but more on the D-side of R&D. The race for papers never appealed to me and I'm a practical not theoretical type, but I do like reading papers on certain topics (e.g. Google Papers, NLP, FB papers, ...) a lot. How common is it that normally working developers submit papers to conferences or even journals? It seems to be somewhat common in certain companies (it's not common or encouraged in mine). Do journals or conferences even take papers by an academic nobody (BSc) under consideration? I ask, because I have a few rough ideas and I would just like to bring them into form, one way or the other. Bonus question: Is there a list of CS (in the widest sense) conferences/journals with short descriptions? PS (Four out of five researchers I met published quite some fluffy stuff for my taste. I am no expert, but those people told me sometimes themselves, that the implementation does not matter, just the idea and the presentation. I always wondered about that. I probably could write about ideas all day long (not instantly but with a bit of preparation), but the implementation and the practical part is the really hard part, that academia just does not like to be concerned with. Also many papers actually scream: I was written so the publication list of my author gets longer - which is a waste of time for everyone, and often a waste of tax money, too. When I think of CS-ish papers, I think of running implementations or actual data, like e.g. Google's Map Reduce, Serving Large-scale Batch Computed Data with Project Voldemort or the like.)

    Read the article

  • Hadoop, NOSQL, and the Relational Model

    - by Phil Factor
    (Guest Editorial for the IT Pro/SysAdmin Newsletter)Whereas Relational Databases fit the world of commerce like a glove, it is useless to pretend that they are a perfect fit for all human endeavours. Although, with SQL Server, we’ve made great strides with indexing text, in processing spatial data and processing markup, there is still a problem in dealing efficiently with large volumes of ephemeral semi-structured data. Key-value stores such as Cassandra, Project Voldemort, and Riak are of great value for ephemeral data, and seem of equal value as a data-feed that provides aggregations to an RDBMS. However, the Document databases such as MongoDB and CouchDB are ideal for semi-structured data for which no fixed schema exists; analytics and logging are obvious examples. NoSQL products, such as MongoDB, tackle the semi-structured data problem with panache. MongoDB is designed with a simple document-oriented data model that scales horizontally across multiple servers. It doesn’t impose a schema, and relies on the application to enforce the data structure. This is another take on the old ‘EAV’ problem (where you don’t know in advance all the attributes of a particular entity) It uses a clever replica set design that allows automatic failover, and uses journaling for data durability. It allows indexing and ad-hoc querying. However, for SQL Server users, the obvious choice for handling semi-structured data is Apache Hadoop. There will soon be an ODBC Driver for Apache Hive .and an Add-in for Excel. Additionally, there are now two Hadoop-based connectors for SQL Server; the Apache Hadoop connector for SQL Server 2008 R2, and the SQL Server Parallel Data Warehouse (PDW) connector. We can connect to Hadoop process the semi-structured data and then store it in SQL Server. For one steeped in the culture of Relational SQL Databases, I might be expected to throw up my hands in the air in a gesture of contempt for a technology that was, judging by the overblown journalism on the subject, about to make my own profession as archaic as the Saggar makers bottom knocker (a potter’s assistant who helped the saggar maker to make the bottom of the saggar by placing clay in a metal hoop and bashing it). However, on the contrary, I find that I'm delighted with the advances made by the NoSQL databases in the past few years. Having the flow of ideas from the NoSQL providers will knock any trace of complacency out of the providers of Relational Databases and inspire them into back-fitting some features, such as horizontal scaling, with sharding and automatic failover into SQL-based RDBMSs. It will do the breed a power of good to benefit from all this lateral thinking.

    Read the article

  • How can i export a folder of Firefox Bookmarks into a text file

    - by Memor-X
    I want to clean up my bookmark folders by getting rid of duplicated/links. i have created a program which will import 2 text files which contains a URL like this File 1: http://www.google/com http://anime.stackexchange.com/ https://www.fanfiction.net/guidelines/ https://www.fanfiction.net/anime/Magical-Girl-Lyrical-Nanoha/?&srt=1&g1=2&lan=1&r=103&s=2 File 2: http://scifi.stackexchange.com/ http://scifi.stackexchange.com/questions/56142/why-didnt-dumbledore-just-hunt-voldemort-down http://anime.stackexchange.com/ http://scifi.stackexchange.com/questions/5650/how-can-the-doctor-be-poisoned the program compares the 2 lists and creates a single master list with duplicate URLs removed. Now i have a few Backup bookmark folders in Firefox which on occasion i'll bookmark all tabs into a new folder with the date of the backup before i close off tabs or do reset my PC. each folder can have between 1000-2000 bookmarks, sometimes there are a bunch of pages which keep getting bookmarked, ie, i have ~50 pages on the Magical Girl Lyrical Nanoha Wiki on different spells, character and terminology which i commonly look back on. I'd like to know how i can export a bookmark folder so i have a list of URLs similar to what i use in my program

    Read the article

  • What scalability problems have you solved using a NoSQL data store?

    - by knorv
    NoSQL refers to non-relational data stores that break with the history of relational databases and ACID guarantees. Popular open source NoSQL data stores include: Cassandra (tabular, written in Java, used by Facebook, Twitter, Digg, Rackspace, Mahalo and Reddit) CouchDB (document, written in Erlang, used by Engine Yard and BBC) Dynomite (key-value, written in C++, used by Powerset) HBase (key-value, written in Java, used by Bing) Hypertable (tabular, written in C++, used by Baidu) Kai (key-value, written in Erlang) MemcacheDB (key-value, written in C, used by Reddit) MongoDB (document, written in C++, used by Sourceforge, Github, Electronic Arts and NY Times) Neo4j (graph, written in Java, used by Swedish Universities) Project Voldemort (key-value, written in Java, used by LinkedIn) Redis (key-value, written in C, used by Engine Yard, Github and Craigslist) Riak (key-value, written in Erlang, used by Comcast and Mochi Media) Ringo (key-value, written in Erlang, used by Nokia) Scalaris (key-value, written in Erlang, used by OnScale) ThruDB (document, written in C++, used by JunkDepot.com) Tokyo Cabinet/Tokyo Tyrant (key-value, written in C, used by Mixi.jp (Japanese social networking site)) I'd like to know about specific problems you - the SO reader - have solved using data stores and what NoSQL data store you used. Questions: What scalability problems have you used NoSQL data stores to solve? What NoSQL data store did you use? What database did you use before switching to a NoSQL data store? I'm looking for first-hand experiences, so please do not answer unless you have that.

    Read the article

  • Non-Relational Database Design

    - by Ian Varley
    I'm interested in hearing about design strategies you have used with non-relational "nosql" databases - that is, the (mostly new) class of data stores that don't use traditional relational design or SQL (such as Hypertable, CouchDB, SimpleDB, Google App Engine datastore, Voldemort, Cassandra, SQL Data Services, etc.). They're also often referred to as "key/value stores", and at base they act like giant distributed persistent hash tables. Specifically, I want to learn about the differences in conceptual data design with these new databases. What's easier, what's harder, what can't be done at all? Have you come up with alternate designs that work much better in the non-relational world? Have you hit your head against anything that seems impossible? Have you bridged the gap with any design patterns, e.g. to translate from one to the other? Do you even do explicit data models at all now (e.g. in UML) or have you chucked them entirely in favor of semi-structured / document-oriented data blobs? Do you miss any of the major extra services that RDBMSes provide, like relational integrity, arbitrarily complex transaction support, triggers, etc? I come from a SQL relational DB background, so normalization is in my blood. That said, I get the advantages of non-relational databases for simplicity and scaling, and my gut tells me that there has to be a richer overlap of design capabilities. What have you done? FYI, there have been StackOverflow discussions on similar topics here: the next generation of databases changing schemas to work with Google App Engine choosing a document-oriented database

    Read the article

  • Simplest distributed persistent key/value store that supports primary key range queries

    - by StaxMan
    I am looking for a properly distributed (i.e. not just sharded) and persisted (not bounded by available memory on single node, or cluster of nodes) key/value ("nosql") store that does support range queries by primary key. So far closest such system is Cassandra, which does above. However, it adds support for other features that are not essential for me. So while I like it (and will consider using it of course), I am trying to figure out if there might be other mature projects that implement what I need. Specifically, for me the only aspect of value I need is to access it as a blob. For key, however, I need range queries (as in, access values ordered, limited by start and/or end values). While values can have structures, there is no need to use that structure for anything on server side (can do client-side data binding, flexible value/content types etc). For added bonus, Cassandra style storage (journaled, all sequential writes) seems quite optimal for my use case. To help filter out answers, I have investigated some alternatives within general domain like: Voldemort (key/value, but no ordering) and CouchDB (just sharded, more batch-oriented); and am aware of systems that are not quite distributed while otherwise qualifying (bdb variants, tokyo cabinet itself (not sure if Tyrant might qualify), redis (in-memory store only)).

    Read the article

1