Search Results

Search found 371 results on 15 pages for 'cassandra clark'.

Page 6/15 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13  | Next Page >

  • Database that consumes less disk space

    - by Hugo Palma
    I'm looking at solutions to store a massive quantity of information while consuming the least possible disk space. The information structure is very simple and the queries will also be very simple. I've looked at solutions like Apache Cassandra and relational databases but couldn't find a comparison where disk usage is mentioned. Any ideas on this would be great.

    Read the article

  • Column-oriented DBMS and JOIN operations

    - by André
    From some of the research I've done on NoSQL, column-oriented databases (like HBase or Cassandra) seem to solve the problem of costly JOIN operations, but I don't get how this approach solves this problem. Can anyone explain it to me and/or link me to interesting documentation regarding this area? Thanks

    Read the article

  • 0 connected nodes in datastax opscenter

    - by gansbrest
    Installed opscenterd on a separate node outside of the cluster, but within the firewall (AWS security group). Tested all possible ports between the agents and the OpsCenter server. No errors in the log:

      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: Initializing event storage.
      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: Attempting to load all persisted alert rules
      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: Done loading persisted alert rules
      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: Done initializing event storage.
      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: Done loading persisted scheduled job descriptions
      2013-10-30 01:07:23+0000 [FC_Cluster] INFO: OpsCenter starting up.
      2013-10-30 01:07:23+0000 [] INFO: Finished starting new cluster services for FC_Cluster
      2013-10-30 01:08:04+0000 [FC_Cluster] INFO: Agent for ip 10.34.10.185 is version u'3.2.2'
      2013-10-30 01:08:04+0000 [FC_Cluster] INFO: Agent for ip 10.32.37.251 is version u'3.2.2'
      2013-10-30 01:08:04+0000 [FC_Cluster] INFO: Agent for ip 10.82.226.252 is version u'3.2.2'

    The most interesting part is that I can see some data in the OpsCenter UI: when I stop the agents, no data is displayed, and when I start them it shows up again, but at the same time it shows 0 connected nodes. Storage capacity is even funnier: 3 of 0 nodes. Any ideas why that could be happening?

    Read the article

  • which NoSQL for billions of records [closed]

    - by airtruk
    There are plenty of discussions about NoSQL databases, and a lot of them are about data logging in the social media sector. The problem I'm trying to solve falls more into the scientific computing sector, where I have several thousand billion pieces of information that I want to query with different criteria for each query. All data lives in at least a 4-dimensional space, which means I have a 3D location (x, y, z) and a time component, plus the value and unit. Say, temperature at xyz and 10 min in degrees Celsius. A typical query result may contain several million results ... I have read that pretty much all NoSQL solutions are exceptionally fast at inserting records, but when it comes to querying them it's a different story. I'm leaning towards MongoDB for the implementation and as the platform for developing the necessary code, since it is more closely related to the current solution using MySQL. Happy to be proven wrong, though, when it comes to the choice of NoSQL solution.

    Read the article

  • How can I get started with BigData?

    - by ????? ????????
    I have a programming background, and I've done lots of database design and written lots of queries with SQL Server. I am really excited about looking at big data solutions. I know almost nothing about it. The way I want to learn is to sign up for a sandbox where I can try things out.

    Questions:
      - Is there a sandbox where I can play around with Hadoop? It does not have to be free.
      - Would Amazon EMR be the right path to go?
      - What technologies should I be looking at to get started quickly?
      - Is there a 'bigdata' dataset that is available to play with?

    Thank you so much for your guidance.

    Read the article

  • Difference between Document-oriented-DB and Bigtable clones

    - by chen
    We are looking for a suitable storage engine for our weblog history data. We looked at the Bigtable paper and understand that it suits us well. However, I also understand that a document-oriented DB such as MongoDB seems to provide somewhat more powerful schema capabilities, i.e., it can model our data as well. I wonder how people choose a scalable NoSQL DB nowadays. I have read enough articles like "we looked at A, B and C, and we decided to use C", but I'd like to see some benchmark numbers. What I am saying is that if MongoDB and the like can provide the same level of performance as Bigtable clones, why don't web companies choose it (to be prepared for various potentially more complex data problems)? Thanks. By the way, I read an article (which convinced me at the time) saying Cassandra does not fit M/R operations. Any comments?

    Read the article

  • How should I best store these files?

    - by Triton Man
    I have a set of image files. They are generally very small, between 5k and 100k. They can be any size, though, upwards of 50mb, but this is very rare. Once these images are put into the system they are never modified. There is about 50 TB of these images in total. They are currently chunked and stored in BLOBs in Oracle, but we want to change this since it requires special software to extract them. These images are sometimes accessed at a rate of over 100 requests per second across about 10 servers. I'm thinking about Hadoop or Cassandra, but I really don't know which would be best or how best to index them.

    Read the article

  • Why are my Lucene Document results empty?

    - by vegashacker
    I'm running a simple test: trying to index something and then search for it. I index a simple document, but then when I search for a string in it, I get back what looks to be an empty document (it has no fields). Lucene seems to be doing something, because if I search for a word that's not in the document, it returns 0 results. Any reason why Lucene would reliably return a document when it finds one that matches the given query, and yet that document has nothing in it? Thanks! PS: I'm actually running Lucandra (Lucene + Cassandra). That certainly may be a relevant detail, but I'm not sure.

    Read the article

  • sphinx xmlpipe2 cassandra and ruby 1.9

    - by user369083
    Hi, I have started using Cassandra and I want to index my db with Sphinx. I wrote a Ruby script which is used as the xmlpipe, and I configured Sphinx to use it:

      source xmlsrc
      {
        type = xmlpipe2
        xmlpipe_command = /usr/local/bin/ruby /home/httpd/html/app/script/sphinxpipe.rb
      }

    When I run the script from the console the output looks fine, but when I run the indexer, Sphinx returns an error:

      $ indexer test_index
      Sphinx 0.9.9-release (r2117)
      Copyright (c) 2001-2009, Andrew Aksyonoff
      using config file '/usr/local/etc/sphinx.conf'...
      indexing index 'test_index'...
      ERROR: index 'test_index': source 'xmlsrc': attribute 'id' required in <sphinx:document> (line=10, pos=0, docid=0).
      total 0 docs, 0 bytes
      total 0.000 sec, 0 bytes/sec, 0.00 docs/sec
      total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
      total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

    My script is very simple:

      $stdout.sync = true
      puts %{<?xml version="1.0" encoding="utf-8"?>}
      puts %{<sphinx:docset>}
      puts %{<sphinx:schema>}
      puts %{<sphinx:field name="body"/>}
      puts %{</sphinx:schema>}
      puts %{<sphinx:document id="ba32c02e-79e2-11df-9815-af1b5f766459">}
      puts %{<body><![CDATA[aaa]]></body>}
      puts %{</sphinx:document>}
      puts %{</sphinx:docset>}

    I use ruby 1.9.2-head, ubuntu 10.04, sphinx 0.9.9. How can I get this to work?

    Read the article

  • What database systems should a startup company consider?

    - by Am
    Right now I'm developing the prototype of a web application that aggregates a large number of text entries from a large number of users. This data must be frequently displayed back and is often updated. At the moment I store the content inside a MySQL database and use the NHibernate ORM layer to interact with the DB. I've got tables defined for users, roles, submissions, tags, notifications, etc. I like this solution because it works well and my code looks nice and sane, but I'm also worried about how MySQL will perform once our database grows large. I feel that it may struggle to perform join operations fast enough. This has made me think about non-relational database systems such as MongoDB, CouchDB, Cassandra or Hadoop. Unfortunately I have no experience with any of them. I've read some good reviews of MongoDB and it looks interesting. I'm happy to spend the time and learn if one turns out to be the way to go. I'd much appreciate anyone offering points or issues to consider when going with a non-relational DBMS.

    Read the article

  • NoSQL: How to retrieve a 'house' based on lat & long?

    - by Tedk
    I have a NoSQL system for storing real estate houses. One piece of information I have in my key-value store for each house is the longitude and latitude. Suppose I wanted to retrieve all houses within a geo-lat/long box, like the SQL below:

      SELECT * FROM houses
      WHERE latitude BETWEEN xxx AND yyy
        AND longitude BETWEEN www AND zzz

    Question: How would I do this type of retrieval with NoSQL, using just a key-value store system? Even if I could do this with NoSQL, would it even be efficient, or would simply going back to a traditional database retrieve this type of information faster?
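One common way to approximate a bounding-box query on a plain key-value store is to bucket houses into fixed-size grid cells and use the cell as the key. The sketch below is only an illustration of that idea, not a recommendation for any particular product: the `HouseIndex` class, the `cell_key` helper, the key format and the `CELL_DEG` cell size are all hypothetical, and an in-memory dict stands in for the real store.

```python
import math

CELL_DEG = 0.01  # hypothetical grid cell size in degrees; tune to data density


def cell_index(value):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return math.floor(value / CELL_DEG)


def cell_key(lat_idx, lon_idx):
    """Build the (hypothetical) key under which a cell's houses are stored."""
    return f"cell:{lat_idx}:{lon_idx}"


class HouseIndex:
    """Toy index; a plain dict stands in for the key-value store."""

    def __init__(self):
        self.store = {}

    def add(self, house_id, lat, lon):
        key = cell_key(cell_index(lat), cell_index(lon))
        self.store.setdefault(key, []).append((house_id, lat, lon))

    def in_box(self, lat_min, lat_max, lon_min, lon_max):
        """Return ids of houses inside the box by scanning the covering cells."""
        found = []
        for i in range(cell_index(lat_min), cell_index(lat_max) + 1):
            for j in range(cell_index(lon_min), cell_index(lon_max) + 1):
                for house_id, lat, lon in self.store.get(cell_key(i, j), []):
                    # Edge cells overlap the box only partly, so filter exactly.
                    if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
                        found.append(house_id)
        return found


index = HouseIndex()
index.add("h1", 40.7128, -74.0060)
index.add("h2", 40.7306, -73.9866)
index.add("h3", 34.0522, -118.2437)
nearby = index.in_box(40.70, 40.72, -74.02, -74.00)
```

The number of key lookups grows with the area of the box divided by the cell area, so the cell size trades lookup count against how many out-of-box houses each cell drags in.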

    Read the article

  • inheritance in document database?

    - by nils petersohn
    I am wondering because I searched the PDFs of "xxx the definitive guide" and "beginning xxx" for the word "inheritance" but didn't find anything. Am I missing something? I am doing tablePerHierarchy inheritance with Hibernate and MySQL; does that become deprecated for some reason in xxx? (Replace xxx with the "not only SQL" database you like.)

    Read the article

  • Duplicate partitioning key performance impact

    - by Anshul
    I've read in some posts that having a duplicate partitioning key can have a performance impact. I have two tables like:

      CREATE TABLE "Test1" (
        key text,
        column1 text,
        value text,
        PRIMARY KEY (key, column1)
      )

      CREATE TABLE "Test2" (
        key text,
        name text,
        age text,
        ...
        PRIMARY KEY (key, name, age)
      )

    In Test1, column1 will contain the column name and value will contain its corresponding value. The main advantage of Test1 is that I can add any number of column/value pairs to it without altering the table, just by providing the same partitioning key each time. Now my question is: how will each of these table schemas impact read/write performance if I have millions of rows and the number of columns can be up to 50 in each row? How will it impact the compaction/repair time if I'm writing duplicate entries frequently?

    Read the article

  • Database/NoSQL - Lowest latency way to retrieve the following data...

    - by Nickb
    I have a real estate application and a "house" contains the following information:

      house:
        - house_id
        - address
        - city
        - state
        - zip
        - price
        - sqft
        - bedrooms
        - bathrooms
        - geo_latitude
        - geo_longitude

    I need to perform an EXTREMELY fast (low latency) retrieval of all homes within a geo-coordinate box. Something like the SQL below (if I were to use a database):

      SELECT * FROM houses
      WHERE latitude BETWEEN xxx AND yyy
        AND longitude BETWEEN www AND zzz

    Question: What would be the quickest way for me to store this information so that I can perform the fastest retrieval of data based on latitude & longitude (e.g. database, NoSQL, memcache, etc.)?

    Read the article

  • mysql stored routine vs. mysql-alternative?

    - by user522962
    We are using a MySQL database w/ about 150,000 records (names) total. Our searches on the 'names' field are done through an autocomplete function in PHP. We have the table indexed but still feel that the searching is a bit sluggish (a few full seconds vs. something like Google Finance w/ near-instant response). We came up w/ 2 possibilities, but wanted to get more insight:

      1. Can we create a bunch (many thousands or more) of stored procedures to speed up searches, or will creating that many stored procedures bog down the db?
      2. Is there a faster alternative to MySQL for "select" statements? (Speed on inserting & updating rows isn't too important, so we can sacrifice that if necessary.) I've vaguely heard of BigTable & others that don't support JOIN statements... we need JOIN statements for some of our other queries.

    Thx

    Read the article

  • Apache Cassandra version 0.6.0 is available: 30% performance gain for the database...

    Apache Cassandra version 0.6.0 is available, with a 30% performance gain for the NoSQL database. Cassandra, the now well-known open source, non-relational (NoSQL) database backed by the Apache Foundation, reaches a new stage in its development with the arrival of version 0.6. The goal of this type of DBMS is to provide a decentralized model able to meet significant scalability needs. A concept that has stirred up some debate in the community of...

    Read the article

  • Merge\Combine two datatables

    - by madlan
    I'm trying to merge\combine two datatables. I've looked at various examples and answers, but they seem to create duplicate rows or require indexes (merge on datatable etc.). I can't do this via SQL, as one source is from a linked Oracle server accessed via MSSQL and the other is from a different MSSQL Server that does not have linked access. The data is currently very simple: Name, Email, Phone.

    DataTable1:
      "John Clark", "", "01522 55231"
      "Alex King", "[email protected]", "01522 55266"
      "Marcus Jones", "[email protected]", "01522 55461"

    DataTable2:
      "John Clark", "[email protected]", "01522 55231"
      "Alex King", "[email protected]", ""
      "Marcus Jones", "[email protected]", "01522 55461"
      "Warren bean", "[email protected]", "01522 522311"

    Giving a datatable with the following:
      "John Clark", "[email protected]", "01522 55231"
      "Alex King", "[email protected]", "01522 55266"
      "Marcus Jones", "[email protected]", "01522 55461"
      "Warren bean", "[email protected]", "01522 522311"

    Name is the field to match records on, with the first datatable taking priority.
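The merge rule described above (match on Name, first table wins, blanks filled from the second table, unmatched rows appended) can be sketched independently of the DataTable API. The following is a minimal Python illustration rather than .NET code: the `merge_by_name` function is hypothetical, rows are plain tuples, and the e-mail addresses are made-up placeholders because the originals are redacted in the question.

```python
def merge_by_name(primary, secondary):
    """Merge (name, email, phone) rows keyed on name.

    Values from `primary` take priority; an empty field in `primary`
    is filled from the matching row in `secondary`. Rows that exist
    only in `secondary` are appended.
    """
    merged = {}  # dicts keep insertion order, so primary rows come first
    for name, email, phone in primary:
        merged[name] = [name, email, phone]
    for name, email, phone in secondary:
        if name not in merged:
            merged[name] = [name, email, phone]
            continue
        row = merged[name]
        row[1] = row[1] or email  # keep primary email unless it is blank
        row[2] = row[2] or phone  # keep primary phone unless it is blank
    return [tuple(row) for row in merged.values()]


# Placeholder data; the addresses are invented for illustration only.
table1 = [
    ("John Clark", "", "01522 55231"),
    ("Alex King", "alex@example.com", "01522 55266"),
]
table2 = [
    ("John Clark", "john@example.com", "01522 55231"),
    ("Alex King", "alex@example.com", ""),
    ("Warren bean", "warren@example.com", "01522 522311"),
]
result = merge_by_name(table1, table2)
```

Keying a dictionary on Name avoids both the duplicate rows and the need for a predefined index on either source table; the same approach carries over to a keyed dictionary of rows in .NET.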

    Read the article

  • Java EE suitability for a social network using Cassandra datastore?

    - by Marcos
    We are in the process of making some important technology decisions for a social networking application. We're planning to use Cassandra (a NoSQL database, to support efficient data storage). We would be using Hector (a Java client) to interact with Cassandra.

      1.) Would Java EE be a good choice over PHP for a social networking application in terms of performance, scalability & complexity?
      2.) As another possible implementation strategy, is it suitable to have the backend alone in Java and the rest in PHP?
      3.) What difference (compared to PHP) does it make in terms of costs at the various stages of application development, deployment and maintenance?
      4.) What are the things to keep in mind as we move along with Java development & deployment (as we are relatively new to Java)?
      5.) Could you list some major production deployments of similar (social network) applications in Java?

    Thank you!

    Read the article
