Search Results

Search found 160 results on 7 pages for 'hive'.

Page 1/7 | 1 2 3 4 5 6 7  | Next Page >

  • Problem compiling hive with ant

    - by conandor
    I compiling with Solaris 10 SPARC, jdk 1.6 from Sun, Ant 1.7.1 from OpenCSW. I have no problem running hadoop 0.17.2.1 However, I have problem compiling/integrating hive with the error 'cannot find symbol', although I followed the tutorial. I have the hive source code from SVN exactly from tutorial. How can I know the hive version I compiling and how can I compile against hadoop 0.17.2.1? Please advice. Thank you. -bash-3.00$ export PATH=/usr/jdk/instances/jdk1.6.0/bin:/usr/bin:/opt/csw/bin:/opt/webstack/bin -bash-3.00$ export JAVA_HOME=/usr/jdk/instances/jdk1.6.0 -bash-3.00$ export HADOOP=/export/home/mywork/hadoop-0.17.2.1/bin/hadoop -bash-3.00$ /opt/csw/bin/ant package -Dhadoop.version=0.17.2.1 Buildfile: build.xml jar: create-dirs: compile-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks deploy-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks jar: init: compile: ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /export/home/mywork/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: file = /export/home/mywork/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;[email protected] [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.17.2.1 in hadoop-source [ivy:retrieve] found hadoop#core;0.18.3 in hadoop-source [ivy:retrieve] found hadoop#core;0.19.0 in hadoop-source [ivy:retrieve] found hadoop#core;0.20.0 in hadoop-source [ivy:retrieve] :: resolution report :: resolve 25878ms :: artifacts dl 37ms --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 4 | 0 | 0 | 0 || 4 | 0 | --------------------------------------------------------------------- [ivy:retrieve] :: retrieving :: org.apache.hadoop.hive#shims [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 4 already retrieved (0kB/82ms) install-hadoopcore-internal: build_shims: [echo] Compiling shims against hadoop 0.17.2.1 (/export/home/mywork/hive/build/hadoopcore/hadoop-0.17.2.1) ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /export/home/mywork/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: file = /export/home/mywork/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;[email protected] [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.17.2.1 in hadoop-source [ivy:retrieve] found hadoop#core;0.18.3 in hadoop-source [ivy:retrieve] found hadoop#core;0.19.0 in hadoop-source [ivy:retrieve] found hadoop#core;0.20.0 in hadoop-source [ivy:retrieve] :: resolution report :: resolve 12041ms :: artifacts dl 30ms --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 4 | 0 | 0 | 0 || 4 | 0 | --------------------------------------------------------------------- [ivy:retrieve] :: retrieving :: org.apache.hadoop.hive#shims [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 4 already retrieved (0kB/39ms) install-hadoopcore-internal: build_shims: [echo] Compiling shims against hadoop 0.18.3 (/export/home/mywork/hive/build/hadoopcore/hadoop-0.18.3) ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /export/home/mywork/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: file = /export/home/mywork/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;[email protected] [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.17.2.1 in hadoop-source [ivy:retrieve] found hadoop#core;0.18.3 in hadoop-source [ivy:retrieve] found hadoop#core;0.19.0 in hadoop-source [ivy:retrieve] found hadoop#core;0.20.0 in hadoop-source [ivy:retrieve] :: resolution report :: resolve 11107ms :: artifacts dl 36ms --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 4 | 0 | 0 | 0 || 4 | 0 | --------------------------------------------------------------------- [ivy:retrieve] :: retrieving :: org.apache.hadoop.hive#shims [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 4 already retrieved (0kB/49ms) install-hadoopcore-internal: build_shims: [echo] Compiling shims against hadoop 0.19.0 (/export/home/mywork/hive/build/hadoopcore/hadoop-0.19.0) ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /export/home/mywork/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: file = /export/home/mywork/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#shims;[email protected] [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.17.2.1 in hadoop-source [ivy:retrieve] found hadoop#core;0.18.3 in hadoop-source [ivy:retrieve] found hadoop#core;0.19.0 in hadoop-source [ivy:retrieve] found hadoop#core;0.20.0 in hadoop-source [ivy:retrieve] :: resolution report :: resolve 9969ms :: artifacts dl 33ms --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 4 | 0 | 0 | 0 || 4 | 0 | --------------------------------------------------------------------- [ivy:retrieve] :: retrieving :: org.apache.hadoop.hive#shims [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 4 already retrieved (0kB/57ms) install-hadoopcore-internal: build_shims: [echo] Compiling shims against hadoop 0.20.0 (/export/home/mywork/hive/build/hadoopcore/hadoop-0.20.0) jar: [echo] Jar: shims create-dirs: compile-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks deploy-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks jar: init: install-hadoopcore: install-hadoopcore-default: ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /export/home/mywork/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ :: [ivy:retrieve] :: loading settings :: file = /export/home/mywork/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#common;[email protected] [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.20.0 in hadoop-source [ivy:retrieve] :: resolution report :: resolve 4864ms :: artifacts dl 13ms --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 1 | 0 | 0 | 0 || 1 | 0 | --------------------------------------------------------------------- [ivy:retrieve] :: retrieving :: org.apache.hadoop.hive#common [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 1 already retrieved (0kB/52ms) install-hadoopcore-internal: setup: compile: [echo] Compiling: common jar: [echo] Jar: common create-dirs: compile-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks deploy-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks jar: init: dynamic-serde: compile: [echo] Compiling: hive [javac] Compiling 167 source files to /export/home/mywork/hive/build/serde/classes [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java:30: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorFactory [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java:31: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorUtils [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/MetadataTypedColumnsetSerDe.java:31: cannot find symbol [javac] symbol : class MetadataListStructObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector [javac] import org.apache.hadoop.hive.serde2.objectinspector.MetadataListStructObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:33: cannot find symbol [javac] symbol : class BooleanObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:35: cannot find symbol [javac] symbol : class DoubleObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.DoubleObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:36: cannot find symbol [javac] symbol : class FloatObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.FloatObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:39: cannot find symbol [javac] symbol : class ShortObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.ShortObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:40: cannot find symbol [javac] symbol : class StringObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java:44: cannot find symbol [javac] symbol : class BooleanObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java:46: cannot find symbol [javac] symbol : class DoubleObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.DoubleObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java:47: cannot find symbol [javac] symbol : class FloatObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.FloatObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java:50: cannot find symbol [javac] symbol : class ShortObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.ShortObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java:51: cannot find symbol [javac] symbol : class StringObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java:43: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorFactory [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java:41: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorFactory [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java:26: cannot find symbol [javac] symbol : class LazySimpleStructObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.lazy.objectinspector [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java:39: cannot find symbol [javac] symbol: class LazySimpleStructObjectInspector [javac] LazyNonPrimitive<LazySimpleStructObjectInspector> { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java:68: cannot find symbol [javac] symbol : class LazySimpleStructObjectInspector [javac] location: class org.apache.hadoop.hive.serde2.lazy.LazyStruct [javac] public LazyStruct(LazySimpleStructObjectInspector oi) { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java:36: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorFactory [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java:37: cannot find symbol [javac] symbol : class PrimitiveObjectInspectorUtils [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypeString.java:23: cannot find symbol [javac] symbol : class StringObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypei16.java:23: cannot find symbol [javac] symbol : class ShortObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.ShortObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypeDouble.java:23: cannot find symbol [javac] symbol : class DoubleObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.DoubleObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypeBool.java:23: cannot find symbol [javac] symbol : class BooleanObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.objectinspector.primitive [javac] import org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java:20: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyBooleanObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java:37: cannot find symbol [javac] symbol: class LazyBooleanObjectInspector [javac] LazyPrimitive<LazyBooleanObjectInspector, BooleanWritable> { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java:39: cannot find symbol [javac] symbol : class LazyBooleanObjectInspector [javac] location: class org.apache.hadoop.hive.serde2.lazy.LazyBoolean [javac] public LazyBoolean(LazyBooleanObjectInspector oi) { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByte.java:21: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyByteObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByte.java:37: cannot find symbol [javac] symbol: class LazyByteObjectInspector [javac] LazyPrimitive<LazyByteObjectInspector, ByteWritable> { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByte.java:39: cannot find symbol [javac] symbol : class LazyByteObjectInspector [javac] location: class org.apache.hadoop.hive.serde2.lazy.LazyByte [javac] public LazyByte(LazyByteObjectInspector oi) { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDouble.java:23: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyDoubleObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDouble.java:31: cannot find symbol [javac] symbol: class LazyDoubleObjectInspector [javac] LazyPrimitive<LazyDoubleObjectInspector, DoubleWritable> { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDouble.java:33: cannot find symbol [javac] symbol : class LazyDoubleObjectInspector [javac] location: class org.apache.hadoop.hive.serde2.lazy.LazyDouble [javac] public LazyDouble(LazyDoubleObjectInspector oi) { [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:25: cannot find symbol [javac] symbol : class LazyObjectInspectorFactory [javac] location: package org.apache.hadoop.hive.serde2.lazy.objectinspector [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:26: cannot find symbol [javac] symbol : class LazySimpleStructObjectInspector [javac] location: package org.apache.hadoop.hive.serde2.lazy.objectinspector [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:27: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyBooleanObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:28: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyByteObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:29: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyDoubleObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:30: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyFloatObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:31: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyIntObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:32: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyLongObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:33: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyPrimitiveObjectInspectorFactory; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:34: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyShortObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java:35: package org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive does not exist [javac] import org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector; [javac] ^ [javac] /export/home/mywork/hive/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFloat.java:

    Read the article

  • Find Port Number and Domain Name to connect to Hive Table

    - by user1419563
    I am new to Hive, MapReduce and Hadoop. I am using Putty to connect to hive table and access records in the tables. So what I did is- I opened Putty and in the host name I typed- ares-ingest.vip.host.com and then I click Open. And then I entered my username and password and then few commands to get to Hive sql. Below is the list what I did $ bash bash-3.00$ hive Hive history file=/tmp/rjamal/hive_job_log_rjamal_201207010451_1212680168.txt hive> set mapred.job.queue.name=hdmi-technology; hive> select * from table LIMIT 1; So my question is- I was trying to connect to Hive Tables using Squirrel SQL Client, so in that my Connection URL is- jdbc:hive://ares-ingest.vip.host.com:10000/default. So whenever I try to connect with these attributes, I always get Hive: Could not establish connection to ares-ingest.vip.host.com:10000/default: java.net.ConnectException: Connection timed out: connect. It might be possible I am using wrong port number or domain name here. Is there any way from the command prompt I can find out these two things, like what Domain Name and Port Number(where Hive server is running) should I use to connect to Hive table from Squirrel SQL Client. As I know host and port are determined by where the hive server is running

    Read the article

  • Hive metadata permission issue

    - by Chandramohan
    We are getting this error on Hive, while creating a DB / table hive> CREATE TABLE pokes (foo INT, bar STRING); FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. NestedThrowables: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask Hive log : org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:298) at org.datanucleus.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:601) at org.datanucleus.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:286) at org.datanucleus.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:182) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1953) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1159) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:234) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:261) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:196) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:171) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:354) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:306) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:451) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:232) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:197) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:108) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:1868) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:1878) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:470) ... 15 more Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114) at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:521) at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:290) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:588) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:300) at org.datanucleus.ObjectManagerFactoryImpl.initialiseStoreManager(ObjectManagerFactoryImpl.java:161) at org.datanucleus.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:583) ... 42 more Caused by: java.util.NoSuchElementException: Could not create a validated object, cause: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection. at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1191) at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) ... 52 more 2011-08-11 18:02:36,964 ERROR ql.Driver (SessionState.java:printError(343)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

    Read the article

  • configuration required for HIVE to be installed on a node

    - by ????? ????????
    I went through the process of manually installing ambari (not through SSH, because I couldnt get keyless to work) and everything installed OK, except for HIVE and GANGLIA. I got this message: stderr: None stdout: warning: Unrecognised escape sequence ‘\;’ in file /var/lib/ambari-agent/puppet/modules/hdp-hive/manifests/hive/service_check.pp at line 32 warning: Dynamic lookup of $configuration is deprecated. Support will be removed in Puppet 2.8. Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes. notice: /Stage[1]/Hdp::Snappy::Package/Hdp::Snappy::Package::Ln[32]/Hdp::Exec[hdp::snappy::package::ln 32]/Exec[hdp::snappy::package::ln 32]/returns: executed successfully notice: /Stage[2]/Hdp-hive::Hive::Service_check/File[/tmp/hiveserver2Smoke.sh]/ensure: defined content as ‘{md5}7f1d24221266a2330ec55ba620c015a9' notice: /Stage[2]/Hdp-hive::Hive::Service_check/File[/tmp/hiveserver2.sql]/ensure: defined content as ‘{md5}0c429dc9ae0867b5af74ef85b5530d84' notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/File[/tmp/hcatSmoke.sh]/ensure: defined content as ‘{md5}bae7742f7083db968cb6b2bd208874cb’ notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: 13/06/25 03:11:56 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: 13/06/25 03:12:06 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: FAILED: SemanticException [Error 10001]: Table not found hcatsmokeida8c07401_date102513 notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: 13/06/25 03:12:15 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. notice: /Stage[2]/Hdp-hcat::Hcat::Service_check/Exec[hcatSmoke.sh prepare]/returns: FAILED: SemanticException o When i go to the alerts and health checks i’m getting this: ive Metastore status check CRIT for 42 minutes CRITICAL: Error accessing hive-metaserver status [13/06/25 03:44:06 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. What am I doing wrong? I have already tried to do ambari-server reset on the the database without results.

    Read the article

  • Hadoop Hive web interface options

    - by Garethr
    I've been experimenting with Hive for some data mining activities and would like to make it easily available to less command line orientated colleagues. Hive does now ship with a web interface (http://wiki.apache.org/hadoop/Hive/HiveWebInterface) but it's very basic at this stage. My question is does a visually polished and fully featured interface (either desktop or preferably web based) to Hive exist yet? Are their any open source efforts outside the Hive project working on this?

    Read the article

  • Using Hive in a maven project

    - by Jason
    I have a project that I am migrating from ant to maven. The project makes use of a lightly-customized Hive build. I figured I would just import this build into our internal maven repo and list it as a dependency in the project's pom file. The problem I'm running into is that the Hive build just generates a bunch of jars in build/dist/lib. Some of these are the core Hive jars themselves and some are jars that Hive depends on. What's the best way to deal with these? Should I put all the core hive jars into our internal repo and just deal with undocumented dependencies in the new project's pom file? Or just jar up everything as a jar of jars and deploy that to the repo? Would that approach even work? Kind of a maven newbie still, thanks for any help.

    Read the article

  • Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It  supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Hive Based Registry in Flash

    - by Psychic
    To start with I'll say I've read the post here and I'm still having trouble. I'm trying to create a CE6 image with a hive-based registry that actually stores results through a reboot. I've ticked the hive settings in the catalog items. In common.reg, I've set the location of the hive ([HKEY_LOCAL_MACHINE\init\BootVars] "SystemHive") to "Hard Drive\Registry" (Note: the flash shows up as a device called "Hard Drive") In common.reg, I've set "Flags"=dword:3 in the same place to get the device manager loaded along with the storage manager I've verified that these settings are wrapped in "; HIVE BOOT SECTION" This is where it starts to fall over. It all compiles fine, but on the target system, when it boots, I get: A directory, called "Hard Disk" where a registry is put A device, name called "Hard Disk2" where the permanent flash is Any changes made to the registry are lost on a reboot What am I still missing? Why is the registry not being stored on the flash? Strangly, if I create a random file/directory in the registry directory, it is still there after a reboot, so even though this directory isn't on the other partition (where I tried to put it), it does appear to be permanent. If it is permanent, why don't registry settings save (ie Ethernet adapter IP addresses?) I'm not using any specific profiles, so I'm at a loss as to what the last step is to make this hive registry a permanent store.

    Read the article

  • How does Hive compare to HBase?

    - by mrhahn
    I'm interested in finding out how the recently-released (http://mirror.facebook.com/facebook/hive/hadoop-0.17/) Hive compares to HBase in terms of performance. The SQL-like interface used by Hive is very much preferable to the HBase API we have implemented.

    Read the article

  • Hive performance increase

    - by Sagar Nikam
    I am dealing with a database (2.5 GB) having some tables only 40 row to some having 9 million rows data. when I am doing any query for large table it takes more time. I want results in less time small query on table which have 90 rows only-- hive> select count(*) from cidade; Time taken: 50.172 seconds hdfs-site.xml <configuration> <property> <name>dfs.replication</name> <value>3</value> <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time. </description> </property> <property> <name>dfs.block.size</name> <value>131072</value> <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time. </description> </property> </configuration> does these setting affects performance of hive? dfs.replication=3 dfs.block.size=131072 can i set it from hive prompt as hive>set dfs.replication=5 Is this value remains for a perticular session only ? or Is it better to change it in .xml file ?

    Read the article

  • how to setup hive on a single node?

    - by Harman
    I successfully setup hadoop on ubuntu 10.04 on a single node by going through the steps mentioned in Michael Noll's tutorial ( Running Hadoop On Ubuntu Linux (Single-Node Cluster) ). Now, I'm trying to setup hive on the same machine but I'm stuck as of what to do after I decompress the hive-0.8.1-bin.tar.gz and move it to /usr/local/hive. Any help would be appreciated but as I'm new to Linux, it would be very helpful if someone could help me step-by-step.

    Read the article

  • Hive NR map progress inconsistent and regurlarly restart from 0%

    - by user92471
    I have a Yarn MR (with two ec2 instances to mapreduce) job on a dataset of approximately a thousand avro records, and the map phase is behaving erratically. See the progress below. Of course i checked the logs on resourcemanager and nodemanagers and saw nothing suspicious, but these logs are too verbose What is going on there ? hive> select * from nikon where qs_cs_s_aid='VIEW' limit 10; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_1352281315350_0020, Tracking URL = http://blabla.ec2.internal:8088/proxy/application_1352281315350_0020/ Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=blabla.com:8032 -kill job_1352281315350_0020 Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 0 2012-11-07 11:14:40,976 Stage-1 map = 0%, reduce = 0% 2012-11-07 11:15:06,136 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 10.38 sec 2012-11-07 11:15:07,253 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 12.18 sec 2012-11-07 11:15:08,371 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 12.18 sec 2012-11-07 11:15:09,491 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 12.18 sec 2012-11-07 11:15:10,643 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 15.42 sec (...) 2012-11-07 11:15:35,441 Stage-1 map = 28%, reduce = 0%, Cumulative CPU 37.77 sec 2012-11-07 11:15:36,486 Stage-1 map = 28%, reduce = 0%, Cumulative CPU 37.77 sec here restart at 16% ? 2012-11-07 11:15:37,692 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 21.15 sec 2012-11-07 11:15:38,815 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 21.15 sec 2012-11-07 11:15:39,865 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 21.15 sec 2012-11-07 11:15:41,064 Stage-1 map = 18%, reduce = 0%, Cumulative CPU 22.4 sec 2012-11-07 11:15:42,181 Stage-1 map = 18%, reduce = 0%, Cumulative CPU 22.4 sec 2012-11-07 11:15:43,299 Stage-1 map = 18%, reduce = 0%, Cumulative CPU 22.4 sec here restart at 0% ? 2012-11-07 11:15:44,418 Stage-1 map = 0%, reduce = 0% 2012-11-07 11:16:02,076 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 6.86 sec 2012-11-07 11:16:03,193 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 6.86 sec 2012-11-07 11:16:04,259 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 8.45 sec (...) 2012-11-07 11:16:31,291 Stage-1 map = 22%, reduce = 0%, Cumulative CPU 35.34 sec 2012-11-07 11:16:32,414 Stage-1 map = 26%, reduce = 0%, Cumulative CPU 37.93 sec here restart at 11% ? 2012-11-07 11:16:33,459 Stage-1 map = 11%, reduce = 0%, Cumulative CPU 19.53 sec 2012-11-07 11:16:34,507 Stage-1 map = 11%, reduce = 0%, Cumulative CPU 19.53 sec 2012-11-07 11:16:35,731 Stage-1 map = 13%, reduce = 0%, Cumulative CPU 21.47 sec (...) 2012-11-07 11:16:46,839 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 24.14 sec here restart at 0% ? 2012-11-07 11:16:47,939 Stage-1 map = 0%, reduce = 0% 2012-11-07 11:16:56,653 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 7.54 sec 2012-11-07 11:16:57,814 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 7.54 sec (...) Needless to say the job crashes after some time with an Error: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: -56

    Read the article

  • SQLAuthority News – Download Whitepaper – SQL Server Analysis Services to Hive

    - by pinaldave
    The SQL Server Analysis Service is a very interesting subject and I always have enjoyed learning about it. You can read my earlier article over here. Big Data is my new interest and I have been exploring it recently. During this weekend this blog post caught my attention and I enjoyed reading it. Big Data is the next big thing. The growth is predicted to be 60% per year till 2016. There is no single solution to the growing need of the big data available in the market right now as well there is no one solution in the business intelligence eco-system available as well. However, the need of the solution is ever increasing. I am personally Klout user. You can see my Klout profile over. I do understand what Klout is trying to achieve – a single place to measure the influence of the person. However, it works a bit mysteriously. There are plenty of social media available currently in the internet world. The biggest problem all the social media faces is that everybody opens an account but hardly people logs back in. To overcome this issue and have returned visitors Klout has come up with the system where visitors can give 5/10 K+ to other users in a particular area. Looking at all the activities Klout is doing it is indeed big consumer of the Big Data as well it is early adopter of the big data and Hadoop based system.  Klout has to 1 trillion rows of data to be analyzed as well have nearly thousand terabyte warehouse. Hive the language used for Big Data supports Ad-Hoc Queries using HiveQL there are always better solutions. The alternate solution would be using SQL Server Analysis Services (SSAS) along with HiveQL. As there is no direct method to achieve there are few common workarounds already in place. A new ODBC driver from Klout has broken through the limitation and SQL Server Relation Engine can be used as an intermediate stage before SSAS. In this white paper the same solutions have been discussed in the depth. The white paper discusses following important concepts. The Klout Big Data solution Big Data Analytics based on Analysis Services Hadoop/Hive and Analysis Services integration Limitations of direct connectivity Pass-through queries to linked servers Best practices and lessons learned This white paper discussed all the important concepts which have enabled Klout to go go to the next level with all the offerings as well helped efficiency by offering a few out of the box solutions. I personally enjoy reading this white paper and I encourage all of you to do so. SQL Server Analysis Services to Hive Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, T SQL, Technology

    Read the article

  • How to compact a registry hive?

    - by SLaks
    I'm using a (non-administrator) roaming profile, with a size limit of 4MB. As you can imagine, it is extremely difficult to stay within that size limit. I've noticed that NTUser.dat, which holds my HKEY_CURRENT_USER hive, is 2560KB, single-handedly using more than half of that limit. Is there any way to shrink the hive without administrator privileges? I don't mind losing any settings or preferences stored in it.

    Read the article

  • SYSTEM hive of the register is causing computer to BSOD on boot

    - by FernandoSBS
    After some work on my computer, perhaps some updates installs, it didn't boot anymore. BSOD on windows logo loading. After some research I've found out that if I replace the SYSTEM hive of the register with the last backup it boots ok, but of course goes back to an old stage of machine setup. So my question is: where does register stores hardware and/or updates/software that loads in the machine boot, so that I can check which program is producing that BSOD and disable it? System is Win 7 64 bits SP1

    Read the article

  • New Feature in ODI 11.1.1.6: ODI for Big Data

    - by Julien Testut
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} By Ananth Tirupattur Starting with Oracle Data Integrator 11.1.1.6.0, ODI is offering a solution to process Big Data. This post provides an overview of this feature. With all the buzz around Big Data and before getting into the details of ODI for Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data. So, what is Big Data? Big data includes: structured data (this includes data from relation data stores, xml data stores), semi-structured data (this includes data from weblogs) unstructured data (this includes data from text blob, images) Traditionally, business decisions are based on the information gathered from transactional data. For example, transactional Data from CRM applications is fed to a decision system for analysis and decision making. Products such as ODI play a key role in enabling decision systems. However, with the emergence of massive amounts of semi-structured and unstructured data it is important for decision system to include them in the analysis to achieve better decision making capability. While there is an abundance of opportunities for business for gaining competitive advantages, process of Big Data has challenges. The challenges of processing Big Data include: Volume of data Velocity of data - The high Rate at which data is generated Variety of data In order to address these challenges and convert them into opportunities, we would need an appropriate framework, platform and the right set of tools. Hadoop is an open source framework which is highly scalable, fault tolerant system, for storage and processing large amounts of data. Hadoop provides 2 key services, distributed and reliable storage called Hadoop Distributed File System or HDFS and a framework for parallel data processing called Map-Reduce. Innovations in Hadoop and its related technology continue to rapidly evolve, hence therefore, it is highly recommended to follow information on the web to keep up with latest information. Oracle's vision is to provide a comprehensive solution to address the challenges faced by Big Data. Oracle is providing the necessary Hardware, software and tools for processing Big Data Oracle solution includes: Big Data Appliance Oracle NoSQL Database Cloudera distribution for Hadoop Oracle R Enterprise- R is a statistical package which is very popular among data scientists. ODI solution for Big Data Oracle Loader for Hadoop for loading data from Hadoop to Oracle. Further details can be found here: http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html ODI Solution for Big Data: ODI’s goal is to minimize the need to understand the complexity of Hadoop framework and simplify the adoption of processing Big Data seamlessly in an enterprise. ODI is providing the capabilities for an integrated architecture for processing Big Data. This includes capability to load data in to Hadoop, process data in Hadoop and load data from Hadoop into Oracle. ODI is expanding its support for Big Data by providing the following out of the box Knowledge Modules (KMs). IKM File to Hive (LOAD DATA).Load unstructured data from File (Local file system or HDFS ) into Hive IKM Hive Control AppendTransform and validate structured data on Hive IKM Hive TransformTransform unstructured data on Hive IKM File/Hive to Oracle (OLH)Load processed data in Hive to Oracle RKM HiveReverse engineer Hive tables to generate models Using the Loading KM you can map files (local and HDFS files) to the corresponding Hive tables. For example, you can map weblog files categorized by date into a corresponding partitioned Hive table schema. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Hive control Append KM you can validate and transform data in Hive. In the below example, two source Hive tables are joined and mapped to a target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} The Hive Transform KM facilitates processing of semi-structured data in Hive. In the below example, the data from weblog is processed using a Perl script and mapped to target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Oracle Loader for Hadoop (OLH) KM you can load data from Hive table or HDFS to a corresponding table in Oracle. OLH is available as a standalone product. ODI greatly enhances OLH capability by generating the configuration and mapping files for OLH based on the configuration provided in the interface and KM options. ODI seamlessly invokes OLH when executing the scenario. In the below example, a HDFS file is mapped to a table in Oracle. Development and Deployment:The following diagram illustrates the development and deployment of ODI solution for Big Data. Using the ODI Studio on your development machine create and develop ODI solution for processing Big Data by connecting to a MySQL DB or Oracle database on a BDA machine or Hadoop cluster. Schedule the ODI scenarios to be executed on the ODI agent deployed on the BDA machine or Hadoop cluster. ODI Solution for Big Data provides several exciting new capabilities to facilitate the adoption of Big Data in an enterprise. You can find more information about the Oracle Big Data connectors on OTN. You can find an overview of all the new features introduced in ODI 11.1.1.6 in the following document: ODI 11.1.1.6 New Features Overview

    Read the article

  • Adding resources to solution explorer in experimental hive

    - by Brian Webb
    Hi, I'm currently working on a project using DSL tools in Visual Studio 2008. Is there a way to automatically add a resource into the solution explorer of the experimental hive at runtime? I'm creating new diagrams based on what is on screen, and saving them into the directory the project is stored in. I would like to know if there is a way to get them to automatically get added to the solution explorer? (I don't want to have to drag the files in manually each time)

    Read the article

  • Free data warehouse - Infobright, Hadoop/Hive or what ?

    - by peperg
    I need to store large amount of small data objects (millions of rows per month). Once they're saved they wont change. I need to : store them securely use them to analysis (mostly time-oriented) retrieve some raw data occasionally It would be nice if it could be used with JasperReports or BIRT My first shot was Infobright Community - just a column-oriented, read-only storing mechanism for MySQL On the other hand, people says that NoSQL approach could be better. Hadoop+Hive looks promissing, but the documentation looks poor and the version number is less than 1.0 . I heard about Hypertable, Pentaho, MongoDB .... Do you have any recommendations ? (Yes, I found some topics here, but it was year or two ago)

    Read the article

  • Big Data Matters with ODI12c

    - by Madhu Nair
    contributed by Mike Eisterer On October 17th, 2013, Oracle announced the release of Oracle Data Integrator 12c (ODI12c).  This release signifies improvements to Oracle’s Data Integration portfolio of solutions, particularly Big Data integration. Why Big Data = Big Business Organizations are gaining greater insights and actionability through increased storage, processing and analytical benefits offered by Big Data solutions.  New technologies and frameworks like HDFS, NoSQL, Hive and MapReduce support these benefits now. As further data is collected, analytical requirements increase and the complexity of managing transformations and aggregations of data compounds and organizations are in need for scalable Data Integration solutions. ODI12c provides enterprise solutions for the movement, translation and transformation of information and data heterogeneously and in Big Data Environments through: The ability for existing ODI and SQL developers to leverage new Big Data technologies. A metadata focused approach for cataloging, defining and reusing Big Data technologies, mappings and process executions. Integration between many heterogeneous environments and technologies such as HDFS and Hive. Generation of Hive Query Language. Working with Big Data using Knowledge Modules  ODI12c provides developers with the ability to define sources and targets and visually develop mappings to effect the movement and transformation of data.  As the mappings are created, ODI12c leverages a rich library of prebuilt integrations, known as Knowledge Modules (KMs).  These KMs are contextual to the technologies and platforms to be integrated.  Steps and actions needed to manage the data integration are pre-built and configured within the KMs.  The Oracle Data Integrator Application Adapter for Hadoop provides a series of KMs, specifically designed to integrate with Big Data Technologies.  The Big Data KMs include: Check Knowledge Module Reverse Engineer Knowledge Module Hive Transform Knowledge Module Hive Control Append Knowledge Module File to Hive (LOAD DATA) Knowledge Module File-Hive to Oracle (OLH-OSCH) Knowledge Module  Nothing to beat an Example: To demonstrate the use of the KMs which are part of the ODI Application Adapter for Hadoop, a mapping may be defined to move data between files and Hive targets.  The mapping is defined by dragging the source and target into the mapping, performing the attribute (column) mapping (see Figure 1) and then selecting the KM which will govern the process.  In this mapping example, movie data is being moved from an HDFS source into a Hive table.  Some of the attributes, such as “CUSTID to custid”, have been mapped over. Figure 1  Defining the Mapping Before the proper KM can be assigned to define the technology for the mapping, it needs to be added to the ODI project.  The Big Data KMs have been made available to the project through the KM import process.   Generally, this is done prior to defining the mapping. Figure 2  Importing the Big Data Knowledge Modules Following the import, the KMs are available in the Designer Navigator. v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Figure 3  The Project View in Designer, Showing Installed IKMs Once the KM is imported, it may be assigned to the mapping target.  This is done by selecting the Physical View of the mapping and examining the Properties of the Target.  In this case MOVIAPP_LOG_STAGE is the target of our mapping. Figure 4  Physical View of the Mapping and Assigning the Big Data Knowledge Module to the Target Alternative KMs may have been selected as well, providing flexibility and abstracting the logical mapping from the physical implementation.  Our mapping may be applied to other technologies as well. The mapping is now complete and is ready to run.  We will see more in a future blog about running a mapping to load Hive. To complete the quick ODI for Big Data Overview, let us take a closer look at what the IKM File to Hive is doing for us.  ODI provides differentiated capabilities by defining the process and steps which normally would have to be manually developed, tested and implemented into the KM.  As shown in figure 5, the KM is preparing the Hive session, managing the Hive tables, performing the initial load from HDFS and then performing the insert into Hive.  HDFS and Hive options are selected graphically, as shown in the properties in Figure 4. Figure 5  Process and Steps Managed by the KM What’s Next Big Data being the shape shifting business challenge it is is fast evolving into the deciding factor between market leaders and others. Now that an introduction to ODI and Big Data has been provided, look for additional blogs coming soon using the Knowledge Modules which make up the Oracle Data Integrator Application Adapter for Hadoop: Importing Big Data Metadata into ODI, Testing Data Stores and Loading Hive Targets Generating Transformations using Hive Query language Loading Oracle from Hadoop Sources For more information now, please visit the Oracle Data Integrator Application Adapter for Hadoop web site, http://www.oracle.com/us/products/middleware/data-integration/hadoop/overview/index.html Do not forget to tune in to the ODI12c Executive Launch webcast on the 12th to hear more about ODI12c and GG12c. Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";}

    Read the article

  • Building Simple Workflows in Oozie

    - by dan.mcclary
    Introduction More often than not, data doesn't come packaged exactly as we'd like it for analysis. Transformation, match-merge operations, and a host of data munging tasks are usually needed before we can extract insights from our Big Data sources. Few people find data munging exciting, but it has to be done. Once we've suffered that boredom, we should take steps to automate the process. We want codify our work into repeatable units and create workflows which we can leverage over and over again without having to write new code. In this article, we'll look at how to use Oozie to create a workflow for the parallel machine learning task I described on Cloudera's site. Hive Actions: Prepping for Pig In my parallel machine learning article, I use data from the National Climatic Data Center to build weather models on a state-by-state basis. NCDC makes the data freely available as gzipped files of day-over-day observations stretching from the 1930s to today. In reading that post, one might get the impression that the data came in a handy, ready-to-model files with convenient delimiters. The truth of it is that I need to perform some parsing and projection on the dataset before it can be modeled. If I get more observations, I'll want to retrain and test those models, which will require more parsing and projection. This is a good opportunity to start building up a workflow with Oozie. I store the data from the NCDC in HDFS and create an external Hive table partitioned by year. This gives me flexibility of Hive's query language when I want it, but let's me put the dataset in a directory of my choosing in case I want to treat the same data with Pig or MapReduce code. CREATE EXTERNAL TABLE IF NOT EXISTS historic_weather(column 1, column2) PARTITIONED BY (yr string) STORED AS ... LOCATION '/user/oracle/weather/historic'; As new weather data comes in from NCDC, I'll need to add partitions to my table. That's an action I should put in the workflow. Similarly, the weather data requires parsing in order to be useful as a set of columns. Because of their long history, the weather data is broken up into fields of specific byte lengths: x bytes for the station ID, y bytes for the dew point, and so on. The delimiting is consistent from year to year, so writing SerDe or a parser for transformation is simple. Once that's done, I want to select columns on which to train, classify certain features, and place the training data in an HDFS directory for my Pig script to access. ALTER TABLE historic_weather ADD IF NOT EXISTS PARTITION (yr='2010') LOCATION '/user/oracle/weather/historic/yr=2011'; INSERT OVERWRITE DIRECTORY '/user/oracle/weather/cleaned_history' SELECT w.stn, w.wban, w.weather_year, w.weather_month, w.weather_day, w.temp, w.dewp, w.weather FROM ( FROM historic_weather SELECT TRANSFORM(...) USING '/path/to/hive/filters/ncdc_parser.py' as stn, wban, weather_year, weather_month, weather_day, temp, dewp, weather ) w; Since I'm going to prepare training directories with at least the same frequency that I add partitions, I should also add that to my workflow. Oozie is going to invoke these Hive actions using what's somewhat obviously referred to as a Hive action. Hive actions amount to Oozie running a script file containing our query language statements, so we can place them in a file called weather_train.hql. Starting Our Workflow Oozie offers two types of jobs: workflows and coordinator jobs. Workflows are straightforward: they define a set of actions to perform as a sequence or directed acyclic graph. Coordinator jobs can take all the same actions of Workflow jobs, but they can be automatically started either periodically or when new data arrives in a specified location. To keep things simple we'll make a workflow job; coordinator jobs simply require another XML file for scheduling. The bare minimum for workflow XML defines a name, a starting point, and an end point: <workflow-app name="WeatherMan" xmlns="uri:oozie:workflow:0.1"> <start to="ParseNCDCData"/> <end name="end"/> </workflow-app> To this we need to add an action, and within that we'll specify the hive parameters Also, keep in mind that actions require <ok> and <error> tags to direct the next action on success or failure. <action name="ParseNCDCData"> <hive xmlns="uri:oozie:hive-action:0.2"> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <configuration> <property> <name>oozie.hive.defaults</name> <value>/user/oracle/weather_ooze/hive-default.xml</value> </property> </configuration> <script>ncdc_parse.hql</script> </hive> <ok to="WeatherMan"/> <error to="end"/> </action> There are a couple of things to note here: I have to give the FQDN (or IP) and port of my JobTracker and NameNode. I have to include a hive-default.xml file. I have to include a script file. The hive-default.xml and script file must be stored in HDFS That last point is particularly important. Oozie doesn't make assumptions about where a given workflow is being run. You might submit workflows against different clusters, or have different hive-defaults.xml on different clusters (e.g. MySQL or Postgres-backed metastores). A quick way to ensure that all the assets end up in the right place in HDFS is just to make a working directory locally, build your workflow.xml in it, and copy the assets you'll need to it as you add actions to workflow.xml. At this point, our local directory should contain: workflow.xml hive-defaults.xml (make sure this file contains your metastore connection data) ncdc_parse.hql Adding Pig to the Ooze Adding our Pig script as an action is slightly simpler from an XML standpoint. All we do is add an action to workflow.xml as follows: <action name="WeatherMan"> <pig> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <script>weather_train.pig</script> </pig> <ok to="end"/> <error to="end"/> </action> Once we've done this, we'll copy weather_train.pig to our working directory. However, there's a bit of a "gotcha" here. My pig script registers the Weka Jar and a chunk of jython. If those aren't also in HDFS, our action will fail from the outset -- but where do we put them? The Jython script goes into the working directory at the same level as the pig script, because pig attempts to load Jython files in the directory from which the script executes. However, that's not where our Weka jar goes. While Oozie doesn't assume much, it does make an assumption about the Pig classpath. Anything under working_directory/lib gets automatically added to the Pig classpath and no longer requires a REGISTER statement in the script. Anything that uses a REGISTER statement cannot be in the working_directory/lib directory. Instead, it needs to be in a different HDFS directory and attached to the pig action with an <archive> tag. Yes, that's as confusing as you think it is. You can get the exact rules for adding Jars to the distributed cache from Oozie's Pig Cookbook. Making the Workflow Work We've got a workflow defined and have collected all the components we'll need to run. But we can't run anything yet, because we still have to define some properties about the job and submit it to Oozie. We need to start with the job properties, as this is essentially the "request" we'll submit to the Oozie server. In the same working directory, we'll make a file called job.properties as follows: nameNode=hdfs://localhost:8020 jobTracker=localhost:8021 queueName=default weatherRoot=weather_ooze mapreduce.jobtracker.kerberos.principal=foo dfs.namenode.kerberos.principal=foo oozie.libpath=${nameNode}/user/oozie/share/lib oozie.wf.application.path=${nameNode}/user/${user.name}/${weatherRoot} outputDir=weather-ooze While some of the pieces of the properties file are familiar (e.g., JobTracker address), others take a bit of explaining. The first is weatherRoot: this is essentially an environment variable for the script (as are jobTracker and queueName). We're simply using them to simplify the directives for the Oozie job. The oozie.libpath pieces is extremely important. This is a directory in HDFS which holds Oozie's shared libraries: a collection of Jars necessary for invoking Hive, Pig, and other actions. It's a good idea to make sure this has been installed and copied up to HDFS. The last two lines are straightforward: run the application defined by workflow.xml at the application path listed and write the output to the output directory. We're finally ready to submit our job! After all that work we only need to do a few more things: Validate our workflow.xml Copy our working directory to HDFS Submit our job to the Oozie server Run our workflow Let's do them in order. First validate the workflow: oozie validate workflow.xml Next, copy the working directory up to HDFS: hadoop fs -put working_dir /user/oracle/working_dir Now we submit the job to the Oozie server. We need to ensure that we've got the correct URL for the Oozie server, and we need to specify our job.properties file as an argument. oozie job -oozie http://url.to.oozie.server:port_number/ -config /path/to/working_dir/job.properties -submit We've submitted the job, but we don't see any activity on the JobTracker? All I got was this funny bit of output: 14-20120525161321-oozie-oracle This is because submitting a job to Oozie creates an entry for the job and places it in PREP status. What we got back, in essence, is a ticket for our workflow to ride the Oozie train. We're responsible for redeeming our ticket and running the job. oozie -oozie http://url.to.oozie.server:port_number/ -start 14-20120525161321-oozie-oracle Of course, if we really want to run the job from the outset, we can change the "-submit" argument above to "-run." This will prep and run the workflow immediately. Takeaway So, there you have it: the somewhat laborious process of building an Oozie workflow. It's a bit tedious the first time out, but it does present a pair of real benefits to those of us who spend a great deal of time data munging. First, when new data arrives that requires the same processing, we already have the workflow defined and ready to run. Second, as we build up a set of useful action definitions over time, creating new workflows becomes quicker and quicker.

    Read the article

  • Using Hadoop, are my reducers guaranteed to get all the records with the same key?

    - by samg
    I'm running a hadoop job (using hive actually) which is supposed to uniq lines in a lot of text file. More specifically it chooses the most recently timestamped record for each key in the reduce step. Does hadoop guarantee that every record with the same key, output by the map step, will go to a single reducer, even if there are many reducers running across a cluster? I'm worried that the mapper output might be split after the shuffle happens, in the middle of a set of records with the same key.

    Read the article

  • Oracle Big Data Software Downloads

    - by Mike.Hallett(at)Oracle-BI&EPM
    Companies have been making business decisions for decades based on transactional data stored in relational databases. Beyond that critical data, is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and photographs that can be mined for useful information. Oracle offers a broad integrated portfolio of products to help you acquire and organize these diverse data sources and analyze them alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data Connectors Downloads here, includes: Oracle SQL Connector for Hadoop Distributed File System Release 2.1.0 Oracle Loader for Hadoop Release 2.1.0 Oracle Data Integrator Companion 11g Oracle R Connector for Hadoop v 2.1 Oracle Big Data Documentation The Oracle Big Data solution offers an integrated portfolio of products to help you organize and analyze your diverse data sources alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data, Release 2.2.0 - E41604_01 zip (27.4 MB) Integrated Software and Big Data Connectors User's Guide HTML PDF Oracle Data Integrator (ODI) Application Adapter for Hadoop Apache Hadoop is designed to handle and process data that is typically from data sources that are non-relational and data volumes that are beyond what is handled by relational databases. Typical processing in Hadoop includes data validation and transformations that are programmed as MapReduce jobs. Designing and implementing a MapReduce job usually requires expert programming knowledge. However, when you use Oracle Data Integrator with the Application Adapter for Hadoop, you do not need to write MapReduce jobs. Oracle Data Integrator uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. Employing familiar and easy-to-use tools and pre-configured knowledge modules (KMs), the application adapter provides the following capabilities: Loading data into Hadoop from the local file system and HDFS Performing validation and transformation of data within Hadoop Loading processed data from Hadoop to an Oracle database for further processing and generating reports Oracle Database Loader for Hadoop Oracle Loader for Hadoop is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. It pre-partitions the data if necessary and transforms it into a database-ready format. Oracle Loader for Hadoop is a Java MapReduce application that balances the data across reducers to help maximize performance. Oracle R Connector for Hadoop Oracle R Connector for Hadoop is a collection of R packages that provide: Interfaces to work with Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle database tables Predictive analytic techniques, written in R or Java as Hadoop MapReduce jobs, that can be applied to data in HDFS files You install and load this package as you would any other R package. Using simple R functions, you can perform tasks such as: Access and transform HDFS data using a Hive-enabled transparency layer Use the R language for writing mappers and reducers Copy data between R memory, the local file system, HDFS, Hive, and Oracle databases Schedule R programs to execute as Hadoop MapReduce jobs and return the results to any of those locations Oracle SQL Connector for Hadoop Distributed File System Using Oracle SQL Connector for HDFS, you can use an Oracle Database to access and analyze data residing in Hadoop in these formats: Data Pump files in HDFS Delimited text files in HDFS Hive tables For other file formats, such as JSON files, you can stage the input in Hive tables before using Oracle SQL Connector for HDFS. Oracle SQL Connector for HDFS uses external tables to provide Oracle Database with read access to Hive tables, and to delimited text files and Data Pump files in HDFS. Related Documentation Cloudera's Distribution Including Apache Hadoop Library HTML Oracle R Enterprise HTML Oracle NoSQL Database HTML Recent Blog Posts Big Data Appliance vs. DIY Price Comparison Big Data: Architecture Overview Big Data: Achieve the Impossible in Real-Time Big Data: Vertical Behavioral Analytics Big Data: In-Memory MapReduce Flume and Hive for Log Analytics Building Workflows in Oozie

    Read the article

  • Accessing or Resetting Permissions of a Mounted Registry Hive of a Different User / From a Different System

    - by Synetech
    I’m currently stuck using my backup system until I can replace my dead motherboard. In the meantime, I have put my hard-drive in this system so that I can access my files and keep working on the backup system. Fortunately, I don’t have a permission issues with the files (the partitions are FAT32). The issue I’m having is with the registry. I need to import some of my settings from the hives of my (old? normal?) installation of Windows into the one I’m currently using. Settings from the system hives (SYSTEM, SOFTWARE, etc.) are fine, but the user hive is giving me trouble. I’ve copied the NTUSER.DAT file from my other drive and mounted it with the reg command. Most of the keys (eg Software) are fine and I can access them without problem, but some of them (particularly the Identities key where Outlook Express settings are stored) complains that it cannot be opened. If I open the permissions dialog, I get an error about being unable to view the current permssions. If I then ignore it and try to take ownership of the key and it’s subkeys, I get an access-denied error. If I then add permissions for my user account on this system, I get an error, however I am then able to see the subkeys and values of the key. If I then try to access the subkeys, I get the same original errors. If I repeat the process for each subkey, I can see their values and subkeys, and so on, but of course this gets to be incredibly annoying and time-consuming (especially since the Identities key has a lot of subkeys). Is there an easier/temporary/more correct way to dump a key so that I can import it into my backup system?

    Read the article

  • apache vhost not working consistently

    - by petrus
    I have a vhost on my webserver whose sole and unique goal is to return the client IP adress: [email protected]:~$ cat /home/vhosts/domain.org/index.php <?php echo $_SERVER['REMOTE_ADDR']; echo "\n" ?> This helps me troubleshoot networking issues, especially when NAT is involved. As such, I don't always have domain name resolution and this service needs to work even if queried by its IP address. I'm using it this way: [email protected]:~$ echo "GET /" | nc 88.191.124.41 80 191.51.4.55 [email protected]:~$ echo "GET /" | nc domain.org 80 191.51.4.55 router#more http://88.191.124.41/index.php 88.191.124.254 However I found that it wasn't working from at least a computer: [email protected]:~$ echo "GET /" | nc domain.org 80 [email protected]:~$ [email protected]:~$ echo "GET /" | nc 88.191.124.41 80 [email protected]:~$ What I checked: This is not related to ipv6: [email protected]:~$ echo "GET /" | nc -4 ydct.org 80 [email protected]:~$ [email protected]:~$ echo "GET /" | nc ydct.org 80 2a01:e35:ee8c:180:21c:77ff:fe30:9e36 netcat version is the same (except platform, i386 vs x64): [email protected]:~$ type nc nc est haché (/bin/nc) [email protected]:~$ file /bin/nc /bin/nc: symbolic link to `/etc/alternatives/nc' [email protected]:~$ ls -l /etc/alternatives/nc lrwxrwxrwx 1 root root 15 2010-06-26 14:01 /etc/alternatives/nc -> /bin/nc.openbsd [email protected]:~$ type nc nc est haché (/bin/nc) [email protected]:~$ file /bin/nc /bin/nc: symbolic link to `/etc/alternatives/nc' [email protected]:~$ ls -l /etc/alternatives/nc lrwxrwxrwx 1 root root 15 2011-05-26 01:23 /etc/alternatives/nc -> /bin/nc.openbsd It works when used without the pipe: [email protected]:~$ nc domain.org 80 GET / 2a01:e35:ee8c:180:221:85ff:fe96:e485 And the piping works at least with a test service (netcat listening on 1234/tcp and output to stdout) [email protected]:~$ nc -l -p 1234 GET / [email protected]:~$ [email protected]:~$ echo "GET /" | nc domain.org 1234 [email protected]:~$ I don't know if this issue is more related to netcat or Apache, but I'd appreciate any pointers to troubleshoot this issue ! The IP addresses have been modified but kept consistent for easy reading. bzn is the server, hive is a working client and seth is the client on which I have the issue.

    Read the article

  • How do I create a new folder and deploy files to the 12 hive using VseWSS 1.3?

    - by Nathan DeWitt
    I have created a web part using VSeWSS 1.3. It creates a wsp file and my web part gets installed, everything works great. I would like to also create a folder in the LAYOUTS directory of the 12 hive and place a couple files in there. How do I go about doing this? I know that I can manually place the files there, but I would prefer to have it all done in one fell swoop when I uses stsadm to install my solution. Is there a best practices guide out there for using VSeWSS 1.3 to do this? They changed a bunch of stuff with this new version and I want to make sure I don't mess anything up.

    Read the article

1 2 3 4 5 6 7  | Next Page >