Search Results

Search found 59 results on 3 pages for 'hbase'.

Page 1/3 | 1 2 3  | Next Page >

  • Combining HBase and HDFS results in Exception in makeDirOnFileSystem

    - by utrecht
    Introduction: an attempt to combine HBase and HDFS results in the following:

        2014-06-09 00:15:14,777 WARN org.apache.hadoop.hbase.HBaseFileSystem: Create Directory, retries exhausted
        2014-06-09 00:15:14,780 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
        java.io.IOException: Exception in makeDirOnFileSystem
            at org.apache.hadoop.hbase.HBaseFileSystem.makeDirOnFileSystem(HBaseFileSystem.java:136)
            at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:428)
            at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:148)
            at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:133)
            at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:572)
            at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:432)
            at java.lang.Thread.run(Thread.java:744)
        Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/":vagrant:supergroup:drwxr-xr-x
            at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
            at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
            at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:149)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4891)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4873)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4847)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3192)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3156)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3137)
            at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:669)
            at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:419)
            at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44970)
            at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
            at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
            at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
            at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2153)
            at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2122)
            at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:545)
            at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1915)
            at org.apache.hadoop.hbase.HBaseFileSystem.makeDirOnFileSystem(HBaseFileSystem.java:129)
            ... 6 more

    while configuration and system settings are as follows:

        [vagrant@localhost hadoop-hdfs]$ hadoop fs -ls hdfs://localhost/
        Found 1 items
        -rw-r--r--   3 vagrant supergroup 1010827264 2014-06-08 19:01 hdfs://localhost/ubuntu-14.04-desktop-amd64.iso

    /etc/hadoop/conf/core-site.xml:

        <configuration>
          <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:8020</value>
          </property>
        </configuration>

    /etc/hbase/conf/hbase-site.xml:

        <configuration>
          <property>
            <name>hbase.rootdir</name>
            <value>hdfs://localhost:8020/hbase</value>
          </property>
          <property>
            <name>hbase.cluster.distributed</name>
            <value>true</value>
          </property>
        </configuration>

    /etc/hadoop/conf/hdfs-site.xml:

        <configuration>
          <property>
            <name>dfs.name.dir</name>
            <value>/var/lib/hadoop-hdfs/cache</value>
          </property>
          <property>
            <name>dfs.data.dir</name>
            <value>/tmp/hellodatanode</value>
          </property>
        </configuration>

    NameNode directory permissions:

        [vagrant@localhost hadoop-hdfs]$ ls -ltr /var/lib/hadoop-hdfs/cache
        total 8
        -rwxrwxrwx. 1 hbase hdfs   15 Jun  8 23:43 in_use.lock
        drwxrwxrwx. 2 hbase hdfs 4096 Jun  8 23:43 current

    HMaster is able to start if the fs.defaultFS property has been commented out in core-site.xml. The NameNode is listening:

        [vagrant@localhost hadoop-hdfs]$ netstat -nato | grep 50070
        tcp 0 0 0.0.0.0:50070     0.0.0.0:*        LISTEN      off (0.00/0/0)
        tcp 0 0 33.33.33.33:50070 33.33.33.1:57493 ESTABLISHED off (0.00/0/0)

    and is accessible by navigating to http://33.33.33.33:50070/dfshealth.jsp.

    Question: how to solve the makeDirOnFileSystem exception and let HBase connect to HDFS?
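    The AccessControlException at the heart of the trace says the hbase user may not write to the HDFS root ("/", owned by vagrant, mode drwxr-xr-x), which HBase needs in order to create hbase.rootdir. A minimal sketch of one way out, assuming it is run as the HDFS superuser (here vagrant, the user who started the NameNode): pre-create /hbase and hand it to hbase, which the shell equivalents hadoop fs -mkdir and hadoop fs -chown also achieve.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class CreateHBaseRoot {
            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                conf.set("fs.defaultFS", "hdfs://localhost:8020"); // matches core-site.xml above
                FileSystem fs = FileSystem.get(conf);  // acts as the calling OS user
                Path root = new Path("/hbase");        // matches hbase.rootdir above
                fs.mkdirs(root);
                fs.setOwner(root, "hbase", "supergroup"); // chown requires superuser rights
                fs.close();
            }
        }

    After this, HMaster should find (or be able to populate) its root directory instead of attempting the mkdir itself as hbase.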

    Read the article

  • unable to start regionservers in HBase

    - by amsal
    I have a problem starting regionservers on slave PCs. When I enlist only the master PC in conf/regionservers everything works fine, but when I add two more slaves, HBase does not start. If I delete all HBase folders from the tmp folder on all PCs and then start with 3 regionservers enlisted, HBase starts, but when I try to create a table it fails again (gets stuck). I am using Hadoop 0.20.0, which is working fine, with HBase 0.92.0. I have 3 PCs in the cluster, one master and two slaves. Also, is DNS (with both forward and reverse lookup working) necessary for HBase in my case? And is there any way to replicate an HBase table to all region servers, i.e. to have a copy of the table at each PC and access it locally, so that when I execute a map task it uses its local copy of the table?
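    For reference, a sketch of what conf/regionservers holds: one hostname per line, and every listed name must resolve consistently (forward and reverse) on all machines, which is why DNS or synchronized /etc/hosts entries matter here. The hostnames below are hypothetical:

        master
        slave1
        slave2

    As for the replication question: HBase assigns each region to exactly one region server, so no node holds a full copy of a table; MapReduce jobs over HBase instead get locality by scheduling each map task near the region that hosts its slice of the data.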

    Read the article

  • Clojure / HBase: How to Import HBaseTestingUtility in v0.94.6.1

    - by David Williams
    In Clojure, if I want to start a test cluster using the HBase testing utility, I have to annotate my dependency with: [org.apache.hbase/hbase "0.92.2" :classifier "tests" :scope "test"]. First of all, I have no idea what this means. According to Leiningen's sample project.clj:

        ;; Dependencies are listed as [group-id/name version]; in addition
        ;; to keywords supported by Pomegranate, you can use :native-prefix
        ;; to specify a prefix. This prefix is used to extract natives in
        ;; jars that don't adhere to the default "<os>/<arch>/" layout that
        ;; Leiningen expects.

    Question 1: What does that mean? Question 2: If I upgrade the version to [org.apache.hbase/hbase "0.94.6.1" :classifier "tests" :scope "test"], then I receive a ClassNotFoundException:

        Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration

    What's going on here and how do I fix it?

    Read the article

  • Cannot access data from HBase on Amazon EC2

    - by Najeeb Thalakkatt
    I have a single-node Hadoop machine with HBase running on an Amazon EC2 instance. For some reason the server was restarted, so I had to start Hadoop and HBase again. They then work fine, but the old data in HBase can no longer be accessed through the web services; when I use the shell commands it works fine and I get the data. I recreated the scenario on my local server machine, and there it works fine. The version details are as follows: hadoop-0.20.2, hbase-0.90.5, apache-tomcat-7.0.30 on an Amazon EC2 medium instance. I use RESTful web services with ORM-hbase to access the data.

    Read the article

  • Java process failure (hadoop, hbase)

    - by Vladimir
    Any time I run a Hadoop/HBase process from a command prompt I get an error:

        /usr/local/hadoop/bin/hadoop: line 320: /usr/lib/jvm/jdk1.7.0/bin/java: cannot execute binary file
        /usr/local/hadoop/bin/hadoop: line 390: /usr/lib/jvm/jdk1.7.0/bin/java: cannot execute binary file
        /usr/local/hadoop/bin/hadoop: line 390: /usr/lib/jvm/jdk1.7.0/bin/java: Success

    I get the same kind of error when I start HBase.

        java version "1.7.0_07"
        Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
        Java HotSpot(TM) Server VM (build 23.3-b01, mixed mode)

    Could you please tell me what could cause the issue? Thank you.

    Read the article

  • facing problems while updating rows in hbase

    - by sammy
    Hello, I've just started exploring HBase. I've run the samples SampleUploader, PerformanceEvaluation and rowcount as given in the Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce. The problem I'm facing is this: table1 is my table with the column family column.

        create 'table1','column'
        put 'table1','row1','column:address','SanFrancisco'
        hbase(main):020:0> scan 'table1'
        ROW     COLUMN+CELL
        row1    column=column:address, timestamp=1276351974560, value=SanFrancisco
        put 'table1','row1','column:name','Hannah'
        hbase(main):020:0> scan 'table1'
        ROW     COLUMN+CELL
        row1    column=column:address, timestamp=1276351974560, value=SanFrancisco
        row1    column=column:name, timestamp=1276351899573, value=Hannah

    I want both columns to appear in the same row as different versions. Similarly, if I change the name column to Sarah, it shows the updated row, but I want both the old row and the changed row to appear as 2 different versions so that I can run analysis on the data. What is the mistake I'm making? Thank you a lot. Sammy
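    A note for readers: the two puts above target different column qualifiers (column:address and column:name), so they are two cells in the same row rather than two versions; versions accumulate only when the same cell is rewritten, and scan shows just the newest by default. A minimal sketch in the Java client of that era, reusing the table and family names from the excerpt (the family keeps 3 versions by default):

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.KeyValue;
        import org.apache.hadoop.hbase.client.Get;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.util.Bytes;

        public class VersionsDemo {
            public static void main(String[] args) throws Exception {
                HTable table = new HTable(new HBaseConfiguration(), "table1");

                // Write the same cell twice; the older value is kept as a version.
                Put first = new Put(Bytes.toBytes("row1"));
                first.add(Bytes.toBytes("column"), Bytes.toBytes("name"), Bytes.toBytes("Hannah"));
                table.put(first);
                Put second = new Put(Bytes.toBytes("row1"));
                second.add(Bytes.toBytes("column"), Bytes.toBytes("name"), Bytes.toBytes("Sarah"));
                table.put(second);

                // Read back every stored version, newest first.
                Get get = new Get(Bytes.toBytes("row1"));
                get.addColumn(Bytes.toBytes("column"), Bytes.toBytes("name"));
                get.setMaxVersions(); // without this, only the newest cell is returned
                Result result = table.get(get);
                for (KeyValue kv : result.raw()) {
                    System.out.println(kv.getTimestamp() + " => " + Bytes.toString(kv.getValue()));
                }
            }
        }

    In recent shells, scan 'table1', {VERSIONS => 3} does the analogous thing.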

    Read the article

  • python hbase exception

    - by kula
    When I use client.mutateRow(self.tableName, row, mutations) to write data to HBase, there is an exception:

        IOError: IOError(message="Trying to contact region server Some server,
        retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0,
        listsize=1, region=test,,1276665207312 for region test,,1276665207312,
        row 'hello', but failed after 10 attempts.\nExceptions:\n")

    I use http://pypi.python.org/pypi/hbase-thrift/0.20.4 to write to HBase. It seems to be a library bug. Can anyone help me?

    Read the article

  • HBase as web app backend

    - by NathanD
    Can anyone advise if it is a good idea to have HBase as the primary data source for a web-based application? My primary concern is HBase's response time to queries. Is it possible to have sub-second responses? Edit: more details about the app itself. Amount of data: ~500GB of text data, expected to reach 1TB soon. Number of concurrent users: up to 50. The app will be used to present reports about data stored in HBase, like how many times the keyword "X" occurred in the last 24h. For ~80% of requests from the app I will know the exact key; 20% will be scans (I'm looking into HBase schema design topics to make those run fast).

    Read the article

  • Advanced queries in HBase

    - by Teflon Ted
    Given the following HBase schema scenario (from the official FAQ)...

        How would you design an HBase table for a many-to-many association
        between two entities, for example Student and Course?

        I would define two tables:
        Student: student id; student data (name, address, ...);
                 courses (use course ids as column qualifiers here)
        Course:  course id; course data (name, syllabus, ...);
                 students (use student ids as column qualifiers here)

        This schema gives you fast access to the queries "show all classes for
        a student" (student table, courses family) or "all students for a
        class" (courses table, students family).

    How would you satisfy the request "give me all the students that share at least two courses in common"? Can you build a "query" in HBase that will return that set, or do you have to retrieve all the pertinent data and crunch it yourself in code?
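    Worth noting for readers: HBase has no join or intersection operator, so "students sharing at least two courses" means fetching the course sets and crunching them client-side (or in a MapReduce job at real volume). A sketch under the schema above, assuming the student table and courses family named there:

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Map;
        import java.util.Set;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.util.Bytes;

        public class SharedCourses {
            public static void main(String[] args) throws Exception {
                HTable students = new HTable(new HBaseConfiguration(), "student");
                Scan scan = new Scan();
                scan.addFamily(Bytes.toBytes("courses"));

                // Pull each student's course set (qualifiers of the courses family).
                Map<String, Set<String>> coursesOf = new HashMap<String, Set<String>>();
                ResultScanner scanner = students.getScanner(scan);
                try {
                    for (Result r : scanner) {
                        Set<String> courses = new HashSet<String>();
                        for (byte[] q : r.getFamilyMap(Bytes.toBytes("courses")).keySet()) {
                            courses.add(Bytes.toString(q));
                        }
                        coursesOf.put(Bytes.toString(r.getRow()), courses);
                    }
                } finally {
                    scanner.close();
                }

                // Pairwise intersections; fine for modest data, a M/R job otherwise.
                List<String> ids = new ArrayList<String>(coursesOf.keySet());
                for (int i = 0; i < ids.size(); i++) {
                    for (int j = i + 1; j < ids.size(); j++) {
                        Set<String> shared = new HashSet<String>(coursesOf.get(ids.get(i)));
                        shared.retainAll(coursesOf.get(ids.get(j)));
                        if (shared.size() >= 2) {
                            System.out.println(ids.get(i) + " & " + ids.get(j) + " share " + shared);
                        }
                    }
                }
            }
        }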

    Read the article

  • Hbase / Hadoop Query Help

    - by zechariahs
    I'm working on a project with a friend that will use HBase to store its data. Are there any good query examples? I seem to be writing a ton of Java code to iterate through lists of RowResults when, in SQL land, I could write a simple query. Am I missing something? Or is HBase missing something?
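    One pointer that may help: RowResult belongs to the old pre-0.20 client; the newer Scan/Get API plus server-side filters covers many simple WHERE-style lookups, though joins and aggregates still end up in client code or MapReduce. A hedged sketch of a "WHERE info:status = 'active'" scan, with hypothetical table, family and value names:

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.filter.CompareFilter;
        import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
        import org.apache.hadoop.hbase.util.Bytes;

        public class FilterScan {
            public static void main(String[] args) throws Exception {
                HTable table = new HTable(new HBaseConfiguration(), "mytable");
                // Roughly: SELECT * FROM mytable WHERE info:status = 'active'
                SingleColumnValueFilter filter = new SingleColumnValueFilter(
                        Bytes.toBytes("info"), Bytes.toBytes("status"),
                        CompareFilter.CompareOp.EQUAL, Bytes.toBytes("active"));
                filter.setFilterIfMissing(true); // otherwise rows lacking the column pass
                Scan scan = new Scan();
                scan.setFilter(filter);
                ResultScanner results = table.getScanner(scan);
                try {
                    for (Result r : results) {
                        System.out.println(Bytes.toString(r.getRow()));
                    }
                } finally {
                    results.close();
                }
            }
        }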

    Read the article

  • writing hbase reports

    - by sammy
    Hello, I've just started exploring HBase. I've run 2 samples, SampleUploader from the examples and PerformanceEvaluation, as given in the Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce. Well, my application involves updating a huge amount of data once a day. I need samples that store and retrieve rows based on timestamp and run an analysis over the data. Could you please provide pointers on how to continue, as I don't find many tutorials on samples using TIMESTAMP? Thank you a lot. Sammy
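    On the TIMESTAMP point: both Get and Scan accept a time range, so "everything stored in the last day" is a single ranged scan rather than per-row bookkeeping. A small sketch with hypothetical table and family names:

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.util.Bytes;

        public class DailyReport {
            public static void main(String[] args) throws Exception {
                HTable table = new HTable(new HBaseConfiguration(), "mytable");
                long now = System.currentTimeMillis();
                Scan scan = new Scan();
                scan.setTimeRange(now - 24L * 60 * 60 * 1000, now); // cells written in the last 24h
                ResultScanner results = table.getScanner(scan);
                try {
                    for (Result r : results) {
                        byte[] value = r.getValue(Bytes.toBytes("f"), Bytes.toBytes("data"));
                        System.out.println(Bytes.toString(r.getRow()) + " => " + Bytes.toString(value));
                    }
                } finally {
                    results.close();
                }
            }
        }

    Note the range filters by cell timestamp, so it works as a "what arrived recently" query only if writes don't carry custom timestamps.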

    Read the article

  • How does Hive compare to HBase?

    - by mrhahn
    I'm interested in finding out how the recently released Hive (http://mirror.facebook.com/facebook/hive/hadoop-0.17/) compares to HBase in terms of performance. The SQL-like interface used by Hive is very much preferable to the HBase API we have implemented.

    Read the article

  • Hbase and 1- Many Relation

    - by Shuja
    Hi all, I have one question which is best described by the following scenario. Suppose I have three tables: BaseCategory, Category and Products. Thinking in RDBMS terms, the relationships among these tables are: (1) one BaseCategory has many Categories; (2) one Category has many Products. Now I am thinking of converting this to HBase. Can anybody help me map these relationships into HBase?
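    One common mapping, sketched with hypothetical names: denormalize, letting each parent row carry its children as column qualifiers in a wide row (the same trick as the student/course schema elsewhere on this page), so "all products of a category" becomes a single row read:

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.util.Bytes;

        public class CategoryRow {
            public static void main(String[] args) throws Exception {
                HTable categories = new HTable(new HBaseConfiguration(), "category");
                // One row per category; the 1-to-many edges live in the row itself:
                // a pointer up to the base category, qualifiers down to the products.
                Put p = new Put(Bytes.toBytes("category-42"));
                p.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Laptops"));
                p.add(Bytes.toBytes("info"), Bytes.toBytes("base_category"), Bytes.toBytes("electronics"));
                p.add(Bytes.toBytes("products"), Bytes.toBytes("product-7"), Bytes.toBytes(""));
                p.add(Bytes.toBytes("products"), Bytes.toBytes("product-9"), Bytes.toBytes(""));
                categories.put(p);
            }
        }

    The same shape applied to a BaseCategory table (with category ids as qualifiers) covers the first relationship.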

    Read the article

  • HBase schema help

    - by Jody Powlette
    Coming from a SQL Server background, I'm a newbie with regard to HBase, but the technology looks to be a good fit for what we're doing and the cost is definitely right! I need to maintain a list of log entries, which in an RDBMS I would normally create as:

        create table Log (
            UserID int,
            SiteID int,
            Page   varchar(50),
            Date   smalldatetime
        )

    where one user may have 0 to 1000 rows in this simple table. Typical queries would be to find all the rows for one user, or all the rows for one user on one site. How does this translate into a "map" in HBase, where there is no "row key" AND the same (SiteID, Page) may appear many times? My first thought is that UserID is a row key, but I still don't understand "column families" and the other terminology well enough to see how to set up the table so that one UserID can have many (SiteID, Page, Date) "rows". Any direction is appreciated!
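    One way this is often translated, with all names hypothetical: give each log entry its own row, keyed by a composite of UserID and timestamp, so "all rows for one user" becomes a prefix scan and repeats of (SiteID, Page) never collide. A sketch:

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.util.Bytes;

        public class LogTable {
            public static void main(String[] args) throws Exception {
                HTable log = new HTable(new HBaseConfiguration(), "log");

                // Row key "<userid>|<timestamp>" keeps one user's entries adjacent
                // and time-sorted; the attributes live in one "entry" family.
                String rowKey = "42|" + System.currentTimeMillis();
                Put p = new Put(Bytes.toBytes(rowKey));
                p.add(Bytes.toBytes("entry"), Bytes.toBytes("site"), Bytes.toBytes("17"));
                p.add(Bytes.toBytes("entry"), Bytes.toBytes("page"), Bytes.toBytes("/index.html"));
                log.put(p);

                // "All rows for user 42": scan the key prefix ('~' sorts after any digit).
                Scan scan = new Scan(Bytes.toBytes("42|"), Bytes.toBytes("42|~"));
                ResultScanner results = log.getScanner(scan);
                try {
                    for (Result r : results) {
                        System.out.println(Bytes.toString(r.getRow()));
                    }
                } finally {
                    results.close();
                }
            }
        }

    "All rows for one user on one site" would then either widen the key to user|site|timestamp or filter on entry:site during the scan, depending on which query dominates.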

    Read the article

  • HBase as a multimap

    - by Ibrahim
    Hi guys, I'm doing some large-scale text processing work and I'm trying to get started with Hadoop and HBase. One of the things I need to do is build a multimap of some stuff, which I later use to look things up and get all items with a certain key (in a M/R job). Would it be OK to use HBase and insert many rows with the same key, relying on versions/timestamps to achieve a multimap-like setup, or is this a bad idea? The multimap is built up in the reduce phase of a MapReduce task, by the way, or at least in the way I've formulated it on paper. Thanks! If more information is needed, I'd be happy to provide it; I'm not sure whether this question is clear.
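    A caution and an alternative: leaning on versions means the family's max-versions setting silently ages out older entries, so it makes a fragile multimap. A sketch that instead stores one row per key with one qualifier per value, which behaves like a multimap without that limit (table and family names hypothetical):

        import java.util.ArrayList;
        import java.util.List;
        import java.util.NavigableMap;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.Get;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.util.Bytes;

        public class MultimapTable {
            private final HTable table;

            public MultimapTable(HTable table) { this.table = table; }

            // put(key, value): the value itself is the qualifier, so duplicates
            // collapse and nothing is ever dropped by the versions setting.
            public void put(String key, String value) throws Exception {
                Put p = new Put(Bytes.toBytes(key));
                p.add(Bytes.toBytes("v"), Bytes.toBytes(value), Bytes.toBytes(""));
                table.put(p);
            }

            // get(key): every qualifier in the row's "v" family is one value.
            public List<String> get(String key) throws Exception {
                Result r = table.get(new Get(Bytes.toBytes(key)));
                List<String> values = new ArrayList<String>();
                NavigableMap<byte[], byte[]> family = r.getFamilyMap(Bytes.toBytes("v"));
                if (family != null) {
                    for (byte[] qualifier : family.keySet()) {
                        values.add(Bytes.toString(qualifier));
                    }
                }
                return values;
            }

            public static void main(String[] args) throws Exception {
                MultimapTable mm = new MultimapTable(new HTable(new HBaseConfiguration(), "multimap"));
                mm.put("apple", "doc1");
                mm.put("apple", "doc2");
                System.out.println(mm.get("apple")); // [doc1, doc2]
            }
        }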

    Read the article

  • retrieving multiple versions through API through hbase

    - by sammy
    Hello, this is a continuation of my previous question, where I used the hbase shell: http://stackoverflow.com/questions/3024417/facing-problems-while-updating-rows-in-hbase. I tried the same with the API. I'm not able to figure out how to retrieve all versions, iterate over them and print their values for a specific row. I've been spending hours reading; please help me out.

        Scan s = new Scan(Bytes.toBytes("row1"));
        s.addColumn(Bytes.toBytes("column"), Bytes.toBytes("address"));
        // setting the range for the versions
        s.setTimeRange(0L, 6L);
        ResultScanner scanner = table.getScanner(s);
        for (Result r : scanner) {
            for (KeyValue kv : r.sorted()) {
                System.out.println("To " + kv.getTimestamp());
                System.out.println("from " + Bytes.toString(kv.getKey()));
                System.out.println("To " + Bytes.toString(kv.getValue()));
            }
            scanner.close();
        }

    Here I intend to print all versions of the column, but it gives only the most recent one. I'm stuck here.
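    For readers hitting the same wall, a sketched fix under the same setup (imports and the HTable as in the excerpt): the scan never asks for more than one version, since setTimeRange alone only narrows which cells qualify, and the scanner is closed inside the loop. Presumably:

        Scan s = new Scan(Bytes.toBytes("row1"));
        s.addColumn(Bytes.toBytes("column"), Bytes.toBytes("address"));
        s.setMaxVersions();                  // ask for every stored version
        s.setTimeRange(0L, Long.MAX_VALUE);  // the window must span the cells' real timestamps
        ResultScanner scanner = table.getScanner(s);
        try {
            for (Result r : scanner) {
                for (KeyValue kv : r.sorted()) {
                    System.out.println(kv.getTimestamp() + " => " + Bytes.toString(kv.getValue()));
                }
            }
        } finally {
            scanner.close(); // close once, after the whole scan
        }

    The original setTimeRange(0L, 6L) would also exclude everything, since the cells carry epoch-millisecond timestamps like 1276351974560.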

    Read the article

  • Could not start ZK at requested port of 2181, while export HBASE_MANAGES_ZK=false

    - by utrecht
    Problem: the first aim was to run HBase standalone. Navigating to ip:60010/master-status is successful once HBase has been started. The second aim is to run a distinct ZooKeeper quorum. ZooKeeper has been downloaded and has been started:

        netstat -nato | grep 2181
        tcp 0 0 :::2181 :::* LISTEN off (0.00/0/0)

    conf/hbase-env.sh was changed as follows:

        # Tell HBase whether it should manage it's own instance of Zookeeper or not.
        export HBASE_MANAGES_ZK=false

    in order to keep HBase from starting ZooKeeper itself. However, the following error occurs once HBase has been started:

        Could not start ZK at requested port of 2181. ZK was started at port: 2182.
        Aborting as clients (e.g. shell) will not be able to find this ZK quorum.

    Question: how to disable the startup of ZooKeeper by HBase and run ZooKeeper separately?
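    A plausible reading of that error: in standalone mode (hbase.cluster.distributed left at its default of false) the HMaster launch path spins up its own ZooKeeper regardless of HBASE_MANAGES_ZK, which only governs whether the distributed start scripts manage ZK. A hedged sketch of the hbase-site.xml that pairs with HBASE_MANAGES_ZK=false:

        <configuration>
          <property>
            <name>hbase.cluster.distributed</name>
            <value>true</value>
          </property>
          <property>
            <name>hbase.zookeeper.quorum</name>
            <value>localhost</value>
          </property>
        </configuration>

    With that in place, HBase should attach to the ZooKeeper already listening on 2181 instead of starting a second one on 2182.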

    Read the article

  • Cassandra/HBase or just MySQL: Potential problems doing the next thing

    - by alexeypro
    Say I have "user". It's the key. And I need to keep "user count". I am planning to have record with key "user" and value "0" to "9999+ ;-)" (as many as I'll have). What problems I will drive in if I use Cassandra, HBase or MySQL for that? Say, I have thousand of new updates to this "user" key, where I need to increment the value. Am I in trouble? Locked for writes? Any other way of doing that? Why this is done -- there will be a lot of "user"-like keys. Different other cases. But the idea is the same. Why keep it this way -- because I'll have more reads, so I can always get "counted value" very fast.

    Read the article

  • Hbase schema design -- to make sorting easy?

    - by chen
    I have 1M words in my dictionary. Whenever a user issues a query on my website, I check whether the query contains words from my dictionary and increment the counter corresponding to each of them individually. For example, if a user types in "Obama is a president" and "Obama" and "president" are in my dictionary, then I should increment the counter by 1 for both "Obama" and "president". And from time to time, I want to see the top 100 words (most queried words). If I use HBase to store the counters, what schema should I use? I have not come up with an efficient one yet. If I use the words in my dictionary as row keys and "counter" as the column key, then updating a counter (an increment) is very efficient, but it's very hard to sort and return the top 100. Can anyone give good advice? Thanks.
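    One schema that fits both operations, sketched with hypothetical names: keep the word-keyed counter table for cheap increments, and periodically rewrite it into a second table keyed by the inverted, zero-padded count, so lexicographic row order equals descending frequency and the top 100 is just the first 100 rows of a scan:

        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.util.Bytes;

        public class TopWords {
            public static void main(String[] args) throws Exception {
                HBaseConfiguration conf = new HBaseConfiguration();
                HTable counts = new HTable(conf, "wordcount"); // row = word, f:count = 8-byte long
                HTable top = new HTable(conf, "topwords");     // row = inverted count + word

                // Periodic job: re-key every counter by (Long.MAX_VALUE - count),
                // zero-padded so byte order matches numeric order.
                ResultScanner rs = counts.getScanner(new Scan());
                try {
                    for (Result r : rs) {
                        long count = Bytes.toLong(r.getValue(Bytes.toBytes("f"), Bytes.toBytes("count")));
                        String key = String.format("%019d:%s", Long.MAX_VALUE - count, Bytes.toString(r.getRow()));
                        Put p = new Put(Bytes.toBytes(key));
                        p.add(Bytes.toBytes("f"), Bytes.toBytes("word"), r.getRow());
                        top.put(p);
                    }
                } finally {
                    rs.close();
                }

                // The top 100 most-queried words are now the first 100 rows.
                ResultScanner topRows = top.getScanner(new Scan());
                try {
                    int n = 0;
                    for (Result r : topRows) {
                        if (n++ >= 100) break;
                        System.out.println(Bytes.toString(r.getValue(Bytes.toBytes("f"), Bytes.toBytes("word"))));
                    }
                } finally {
                    topRows.close();
                }
            }
        }

    Stale rows from earlier runs would need clearing (e.g. writing into a fresh table per run), which is omitted here.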

    Read the article

  • Help regarding no sql databases like hadoop, hbase etc

    - by user560370
    I am new to distributed NoSQL databases like Hadoop, Cassandra, etc. I have a few questions for which I seek expert advice: Can you list the problems/challenges one will generally face when shifting from a conventional database like MySQL to these large cluster-based databases? What are the difficulties, if any, when one needs to adapt to a newer version of these open-source projects? Can you list the things which are generally stored/kept in memcached for fast rendering of a page? How can I understand the source code of open-source projects so that I can build on them and maybe give back to the community? These questions may sound basic, but it's a request for the experts to answer them in detail and to the best of their abilities.

    Read the article

  • Using HBase or Cassandra for a token server

    - by crippy
    I've been trying to figure out how to use HBase/Cassandra for a token system we're re-implementing. I can probably squeeze quite a lot more from MySQL, but it seems we've come to clinging to the wrong tool for the task just because we know it well, and we will eventually hit a wall (as happened to us in other areas). Naturally I started looking into possible NoSQL solutions; the prominent ones (at least in terms of buzz) are HBase and Cassandra. The story is more or less like this:

        - A user can send a gift to other users.
        - Each gift has a list of recipients, or is public, in which case it is
          limited by number or expiration date.
        - For each gift sent we generate a token that uniquely identifies that gift.
        - For each gift we track the list of potential recipients and their current
          status relating to that gift (accepted, declined etc).
        - A user can request to see all his currently pending gifts.
        - A user can request a list of users he has sent a gift to today
          (used to limit the number of gifts sent).
        - We require the ability to "dump" or "ignore" expired gifts
          (x-day-old gifts are considered expired).

    There are some other requirements, but I believe the above covers the essentials. How would I go about modeling that using HBase or Cassandra? Well, the wall was performance: a few tens of millions of records per day over 2 tables, kept for 2 weeks (I wish I could have kept them longer, but there was no way). The response times kept getting slower and slower until eventually we had to start cutting down the number of days we kept data. Caching helps here, but it's not an ideal solution since a big part of the ops are updates. Also, as I hinted in my original post, we use MySQL extensively. We know exactly what it can and can't do, in naive implementations, then with native partitioning, and finally with our dataset horizontally sharded at the application level across multiple DB nodes. It can be done, but that's not really what I'm trying to get from this. I asked a very specific question about designing a solution using a NoSQL system, since it's very hard to find examples of such designs out there. Brainlag, not trying to come off as rude; I actually appreciate a lot that you are the only one who even bothered to respond. But I see it over and over again: people ask questions and others assume they have no idea what they're talking about and give an irrelevant answer. Please ignore the RDBMS; the question is about NoSQL.

    Read the article

  • CouchDB, HDFS, HBase or which is right for my situation?

    - by Lucas
    Hello all, this question is regarding data storage systems such as CouchDB, HDFS and HBase, and specifically which is right for my situation. I am looking at making a simple and customized document management system for my organization. Basically, we need the ability to store some Word documents, PDFs and other similar files. I also want to store metadata about these files (e.g. author, dates, etc). Usage permissions would also be handy, but that can probably be built using metadata. I would also need the ability to full-text index. The ability to version, while not required, would be extremely useful. I would like the ability to simply add hardware to expand the resources of the system, and the system must support network-attached storage over the CIFS or NFS protocol(s). I have read about CouchDB, HDFS and HBase. My preferred programming language is C#, as all of my end-users will be running Windows machines and I will want to make both web and WinForms client implementations. My question is: which solution best fits my needs? Based on my research it appears that CouchDB (utilizing CouchDB-Lounge and CouchDB-Lucene) perfectly fits my needs. However, since I have worked with CouchDB, I am worried that I might be overlooking something useful for my needs in HDFS or HBase due to a bias. Any and all opinions are welcome, as I am looking for community input; I really do not want to make the wrong choice at the start of my project. Please ask if you need more information. I thank you all for your time, input and assistance.

    Read the article

1 2 3  | Next Page >