Search Results

Search found 68715 results on 2749 pages for 'mysql data'.

Page 276/2749 | < Previous Page | 272 273 274 275 276 277 278 279 280 281 282 283  | Next Page >

  • Which MySQL Datatype to use for storing boolean values from/to PHP?

    - by Beat
    Since MySQL doesn't seem to have any 'boolean' datatype, which datatype do you 'abuse' for storing true/false information in MySQL? Especially in the context of writing and reading from/to a PHP-Script. Over time I have used and seen several approaches: tinyint, varchar fields containing the values 0/1, varchar fields containing the strings '0'/'1' or 'true'/'false' and finally enum Fields containing the two options 'true'/'false'. None of the above seems optimal, I tend to prefer the tinyint 0/1 variant, since automatic type conversion in PHP gives me boolean values rather simply. So which datatype do you use, is there a type designed for boolean values which I have overlooked? Do you see any advantages/disadvantages by using one type or another?

    Read the article

  • You have an error in your SQL syntax; check the manual that corresponds to your MySQL

    - by LuisEValencia
    I am trying to run a mysql query to find all occurences of a text. I have a syntax error but dont know where or how to fix it I am using sqlyog to execute this script DECLARE @url VARCHAR(255) SET @url = '1720' SELECT 'select * from ' + RTRIM(tbl.name) + ' where ' + RTRIM(col.name) + ' like %' + RTRIM(@url) + '%' FROM sysobjects tbl INNER JOIN syscolumns col ON tbl.id = col.id AND col.xtype IN (167, 175, 231, 239) -- (n)char and (n)varchar, there may be others to include AND col.length > 30 -- arbitrary min length into which you might store a URL WHERE tbl.type = 'U' -- user defined table 1 queries executed, 0 success, 1 errors, 0 warnings Query: declare @url varchar(255) set @url = '1720' select 'select * from ' + rtrim(tbl.name) + ' where ' + rtrim(col.name) + ' like %' ... Error Code: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'declare @url varchar(255)

    Read the article

  • visual description for data structure

    - by radi
    i have a data structure for my compiler (such as ast) , and i need a method to print it (like ms visio) and verify its contents (i need to verify the contents of the ast nodes) note : i dont want to print it to the console , i am using c++ & qt thanks

    Read the article

  • How to crosscheck two tables and insert relevant data into a new table in MYSQL?

    - by JackDamery
    I'm trying to crosscheck a row that exists in two tables using a MYSQL query in phpmyadmin and then if a userID is found in both tables, insert their userID and user name into another table. Here's my code: INSERT INTO userswithoutmeetings SELECT user.userID IF('user.userID'='meeting.userID'); I keep getting plagued by this error: 1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF('user.userID'='meeting.userID')' at line 3 Other statements I've tried have worked but not deposited the values in the table.

    Read the article

  • Amazon EC2 Instance - m1.medium Ubuntu 12.04 - Started to crash three days ago

    - by Joy
    The environment: Amazon EC2 Instance - m1.medium Ubuntu 12.04 Apache 2.2.22 - Running a Drupal Site Using MySQL DB Server RAM info: ~$ free -gt total used free shared buffers cached Mem: 3 1 2 0 0 0 -/+ buffers/cache: 0 2 Swap: 0 0 0 Total: 3 1 2 Hard drive info: Filesystem Size Used Avail Use% Mounted on /dev/xvda1 7.9G 4.7G 2.9G 62% / udev 1.9G 8.0K 1.9G 1% /dev tmpfs 751M 180K 750M 1% /run none 5.0M 0 5.0M 0% /run/lock none 1.9G 0 1.9G 0% /run/shm /dev/xvdb 394G 199M 374G 1% /mnt The problem About two days ago the site started failing becaue the MySQL server was shut down by Apache with the following message: kernel: [2963685.664359] [31716] 106 31716 226946 22748 0 0 0 mysqld kernel: [2963685.664730] Out of memory: Kill process 31716 (mysqld) score 23 or sacrifice child kernel: [2963685.664764] Killed process 31716 (mysqld) total-vm:907784kB, anon-rss:90992kB, file-rss:0kB kernel: [2963686.153608] init: mysql main process (31716) killed by KILL signal kernel: [2963686.169294] init: mysql main process ended, respawning That states that the VM was occupying 0.9GB, but my Ram has 2GB free, so 1GB was still left free. I understand that in Linux applications can allocate more memory than physically available. I don't know if this is the problme, it's the first time that it has started to happen. Obviously, the MySQL server tries to restart, but there's no memory for it apparently and it won't restart. Here is its error log: Plugin 'FEDERATED' is disabled. The InnoDB memory heap is disabled Mutexes and rw_locks use GCC atomic builtins Compressed tables use zlib 1.2.3.4 Initializing buffer pool, size = 128.0M InnoDB: mmap(137363456 bytes) failed; errno 12 Completed initialization of buffer pool Fatal error: cannot allocate memory for the buffer pool Plugin 'InnoDB' init function returned error. Plugin 'InnoDB' registration as a STORAGE ENGINE failed. Unknown/unsupported storage engine: InnoDB [ERROR] Aborting [Note] /usr/sbin/mysqld: Shutdown complete I simply restarted the Mysql service. About two hours later it happened again. I restarted it. Then it happened again 9 hours later. So then I thought of the MaxClients parameter of apache.conf, so I went to check it out. It was set at 150. I decided to drop it down to 60. As so: <IfModule mpm_prefork_module> ... MaxClients 60 </IfModule> <IfModule mpm_worker_module> ... MaxClients 60 </IfModule> <IfModule mpm_event_module> ... MaxClients 60 </IfModule> Once I did that, I had the apache2 service restart and it all went smoothly for 3/4 of a day. Since at night the MySQL service shut down once again, but this time it wasn't killed by the Apache2 service. Instead it called the OOM-Killer with the following message: kernel: [3104680.005312] mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0 kernel: [3104680.005351] [<ffffffff81119795>] oom_kill_process+0x85/0xb0 kernel: [3104680.548860] init: mysql main process (30821) killed by KILL signal Now I'm out of ideas. Some articles state that the ideal thing to do is change the kernel behaviour with the following (include it to the file /etc/sysctl.conf ) vm.overcommit_memory = 2 vm.overcommit_ratio = 80 So no overcommits will take place. I'm wondering if this is the way to go? Keep in mind I'm no server administrator, I have basic knowldege. Thanks a bunch in advance.

    Read the article

  • Abstracting functionality

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/22/abstracting-functionality.aspxWhat is more important than data? Functionality. Yes, I strongly believe we should switch to a functionality over data mindset in programming. Or actually switch back to it. Focus on functionality Functionality once was at the core of software development. Back when algorithms were the first thing you heard about in CS classes. Sure, data structures, too, were important - but always from the point of view of algorithms. (Niklaus Wirth gave one of his books the title “Algorithms + Data Structures” instead of “Data Structures + Algorithms” for a reason.) The reason for the focus on functionality? Firstly, because software was and is about doing stuff. Secondly because sufficient performance was hard to achieve, and only thirdly memory efficiency. But then hardware became more powerful. That gave rise to a new mindset: object orientation. And with it functionality was devalued. Data took over its place as the most important aspect. Now discussions revolved around structures motivated by data relationships. (John Beidler gave his book the title “Data Structures and Algorithms: An Object Oriented Approach” instead of the other way around for a reason.) Sure, this data could be embellished with functionality. But nevertheless functionality was second. When you look at (domain) object models what you mostly find is (domain) data object models. The common object oriented approach is: data aka structure over functionality. This is true even for the most modern modeling approaches like Domain Driven Design. Look at the literature and what you find is recommendations on how to get data structures right: aggregates, entities, value objects. I´m not saying this is what object orientation was invented for. But I´m saying that´s what I happen to see across many teams now some 25 years after object orientation became mainstream through C++, Delphi, and Java. But why should we switch back? Because software development cannot become truly agile with a data focus. The reason for that lies in what customers need first: functionality, behavior, operations. To be clear, that´s not why software is built. The purpose of software is to be more efficient than the alternative. Money mainly is spent to get a certain level of quality (e.g. performance, scalability, security etc.). But without functionality being present, there is nothing to work on the quality of. What customers want is functionality of a certain quality. ASAP. And tomorrow new functionality needs to be added, existing functionality needs to be changed, and quality needs to be increased. No customer ever wanted data or structures. Of course data should be processed. Data is there, data gets generated, transformed, stored. But how the data is structured for this to happen efficiently is of no concern to the customer. Ask a customer (or user) whether she likes the data structured this way or that way. She´ll say, “I don´t care.” But ask a customer (or user) whether he likes the functionality and its quality this way or that way. He´ll say, “I like it” (or “I don´t like it”). Build software incrementally From this very natural focus of customers and users on functionality and its quality follows we should develop software incrementally. That´s what Agility is about. Deliver small increments quickly and often to get frequent feedback. That way less waste is produced, and learning can take place much easier (on the side of the customer as well as on the side of developers). An increment is some added functionality or quality of functionality.[1] So as it turns out, Agility is about functionality over whatever. But software developers’ thinking is still stuck in the object oriented mindset of whatever over functionality. Bummer. I guess that (at least partly) explains why Agility always hits a glass ceiling in projects. It´s a clash of mindsets, of cultures. Driving software development by demanding small increases in functionality runs against thinking about software as growing (data) structures sprinkled with functionality. (Excuse me, if this sounds a bit broad-brush. But you get my point.) The need for abstraction In the end there need to be data structures. Of course. Small and large ones. The phrase functionality over data does not deny that. It´s not functionality instead of data or something. It´s just over, i.e. functionality should be thought of first. It´s a tad more important. It´s what the customer wants. That´s why we need a way to design functionality. Small and large. We need to be able to think about functionality before implementing it. We need to be able to reason about it among team members. We need to be able to communicate our mental models of functionality not just by speaking about them, but also on paper. Otherwise reasoning about it does not scale. We learned thinking about functionality in the small using flow charts, Nassi-Shneiderman diagrams, pseudo code, or UML sequence diagrams. That´s nice and well. But it does not scale. You can use these tools to describe manageable algorithms. But it does not work for the functionality triggered by pressing the “1-Click Order” on an amazon product page for example. There are several reasons for that, I´d say. Firstly, the level of abstraction over code is negligible. It´s essentially non-existent. Drawing a flow chart or writing pseudo code or writing actual code is very, very much alike. All these tools are about control flow like code is.[2] In addition all tools are computationally complete. They are about logic which is expressions and especially control statements. Whatever you code in Java you can fully (!) describe using a flow chart. And then there is no data. They are about control flow and leave out the data altogether. Thus data mostly is assumed to be global. That´s shooting yourself in the foot, as I hope you agree. Even if it´s functionality over data that does not mean “don´t think about data”. Right to the contrary! Functionality only makes sense with regard to data. So data needs to be in the picture right from the start - but it must not dominate the thinking. The above tools fail on this. Bottom line: So far we´re unable to reason in a scalable and abstract manner about functionality. That´s why programmers are so driven to start coding once they are presented with a problem. Programming languages are the only tool they´ve learned to use to reason about functional solutions. Or, well, there might be exceptions. Mathematical notation and SQL may have come to your mind already. Indeed they are tools on a higher level of abstraction than flow charts etc. That´s because they are declarative and not computationally complete. They leave out details - in order to deliver higher efficiency in devising overall solutions. We can easily reason about functionality using mathematics and SQL. That´s great. Except for that they are domain specific languages. They are not general purpose. (And they don´t scale either, I´d say.) Bummer. So to be more precise we need a scalable general purpose tool on a higher than code level of abstraction not neglecting data. Enter: Flow Design. Abstracting functionality using data flows I believe the solution to the problem of abstracting functionality lies in switching from control flow to data flow. Data flow very naturally is not about logic details anymore. There are no expressions and no control statements anymore. There are not even statements anymore. Data flow is declarative by nature. With data flow we get rid of all the limiting traits of former approaches to modeling functionality. In addition, nomen est omen, data flows include data in the functionality picture. With data flows, data is visibly flowing from processing step to processing step. Control is not flowing. Control is wherever it´s needed to process data coming in. That´s a crucial difference and needs some rewiring in your head to be fully appreciated.[2] Since data flows are declarative they are not the right tool to describe algorithms, though, I´d say. With them you don´t design functionality on a low level. During design data flow processing steps are black boxes. They get fleshed out during coding. Data flow design thus is more coarse grained than flow chart design. It starts on a higher level of abstraction - but then is not limited. By nesting data flows indefinitely you can design functionality of any size, without losing sight of your data. Data flows scale very well during design. They can be used on any level of granularity. And they can easily be depicted. Communicating designs using data flows is easy and scales well, too. The result of functional design using data flows is not algorithms (too low level), but processes. Think of data flows as descriptions of industrial production lines. Data as material runs through a number of processing steps to be analyzed, enhances, transformed. On the top level of a data flow design might be just one processing step, e.g. “execute 1-click order”. But below that are arbitrary levels of flows with smaller and smaller steps. That´s not layering as in “layered architecture”, though. Rather it´s a stratified design à la Abelson/Sussman. Refining data flows is not your grandpa´s functional decomposition. That was rooted in control flows. Refining data flows does not suffer from the limits of functional decomposition against which object orientation was supposed to be an antidote. Summary I´ve been working exclusively with data flows for functional design for the past 4 years. It has changed my life as a programmer. What once was difficult is now easy. And, no, I´m not using Clojure or F#. And I´m not a async/parallel execution buff. Designing the functionality of increments using data flows works great with teams. It produces design documentation which can easily be translated into code - in which then the smallest data flow processing steps have to be fleshed out - which is comparatively easy. Using a systematic translation approach code can mirror the data flow design. That way later on the design can easily be reproduced from the code if need be. And finally, data flow designs play well with object orientation. They are a great starting point for class design. But that´s a story for another day. To me data flow design simply is one of the missing links of systematic lightweight software design. There are also other artifacts software development can produce to get feedback, e.g. process descriptions, test cases. But customers can be delighted more easily with code based increments in functionality. ? No, I´m not talking about the endless possibilities this opens for parallel processing. Data flows are useful independently of multi-core processors and Actor-based designs. That´s my whole point here. Data flows are good for reasoning and evolvability. So forget about any special frameworks you might need to reap benefits from data flows. None are necessary. Translating data flow designs even into plain of Java is possible. ?

    Read the article

  • drop down and post data to data base

    - by DAFFODIL
    This is a form which retrieves data from db and displays them in table. At the beginning of each row there will be a check box. If there are 10 rows fetched, I ii check 5 rows and insert them in to diff db but here when, I click drop down box data is getting in to db automatically,bcoz I use onchange event. Any alternative to prevent this to happen. Data should be inserted only when, I click submit button. Any help will be appreciated <?php $con = mysql_connect("localhost","root",""); if (!$con) { die('Could not connect: ' . mysql_error()); } mysql_select_db("form1", $con); error_reporting(E_ALL ^ E_NOTICE); $nam=$_REQUEST['select1']; $row=mysql_query("select * from inv where name='$nam'"); while($row1=mysql_fetch_array($row)) { $Name=$row1['Name']; $Address =$row1['Address']; $City=$row1['City']; $Pincode=$row1['Pincode']; $No=$row1['No']; $Date=$row1['Date']; $DCNo=$row1['DCNo']; $DcDate=$row1['DcDate']; $YourOrderNo=$row1['YourOrderNo']; $OrderDate=$row1['OrderDate']; $VendorCode=$row1['VendorCode']; $SNo=$row1['SNo']; $descofgoods=$row1['descofgoods']; $Qty=$row1['Qty']; $Rate=$row1['Rate']; $Amount=$row1['Amount']; } ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> <title>Untitled Document</title> <script type="text/javascript"> function ram(id) { var q=document.getElementById('qty_'+id).value; var r=document.getElementById('rate_'+id).value; document.getElementById('amt_'+id).value=q*r; } </script> </head> <body> <form id="form1" name="form1" method="post" action=""> <table width="1315" border="0"> <script type="text/javascript"> function g() { form1.submit(); } </script> <tr> <th>Name</th> <th align="left"><select name="select1" onchange="g();"> <option value="" selected="selected">select</option> <?php $row=mysql_query("select Name from inv "); while($row1=mysql_fetch_array($row)) { ?> <option value="<?php echo $row1['Name'];?>"><?php echo $row1['Name'];?></option> <?php } ?> </select></th> </tr> <tr> <th>Address</th> <th align="left"><textarea name="Address"><?php echo $Address;?></textarea></th> </tr> <tr> <th>City</th> <th align="left"><input type="text" name="City" value='<?php echo $City;?>' /></th> </tr> <tr> <th>Pincode</th> <th align="left"><input type="text" name="Pincode" value='<?php echo $Pincode;?>'></th> </tr> <tr> <th>No</th> <th align="left"><input type="text" name="No2" value='<?php echo $No;?>' readonly="" /></th> </tr> <tr> <th>Date</th> <th align="left"><input type="text" name="Date" value='<?php echo $Date;?>' /></th> </tr> <tr> <th>DCNo</th> <th align="left"><input type="text" name="DCNo" value='<?php echo $DCNo;?>' readonly="" /></th> </tr> <tr> <th>DcDate:</th> <th align="left"><input type="text" name="DcDate" value='<?php echo $DcDate;?>' /></th> </tr> <tr> <th>YourOrderNo</th> <th align="left"><input type="text" name="YourOrderNo" value='<?php echo $YourOrderNo;?>' readonly="" /></th> </tr> <tr> <th>OrderDate</th> <th align="left"><input type="text" name="OrderDate" value='<?php echo $OrderDate;?>' /></th> </tr> <tr> <th width="80">VendorCode</th> <th width="1225" align="left"><input type="text" name="VendorCode" value='<?php echo $VendorCode;?>' readonly="" /></th> </tr> </table> <table width="1313" border="0"> <tr> <td width="44">&nbsp;</td> <td width="71">SNO</td> <td width="527">DESCRIPTION</td> <td width="214">QUANTITY</td> <td width="214">RATE/UNIT</td> <td width="217">AMOUNT</td> </tr> <?php $i=1; $row=mysql_query("select * from inv where Name='$nam'"); while($row1=mysql_fetch_array($row)) { $SNo=$row1['SNo']; $descofgoods=$row1['descofgoods']; $Qty=$row1['Qty']; $Rate=$row1['Rate']; $Amount=$row1['Amount']; ?> <tr> <td><input type="checkbox" name="checkbox" value="checkbox" checked="checked"/></td> <td><input type="text" name="No[<?php echo $i?>]" value='<?php echo $SNo;?>' readonly=""/></td> <td><input type="text" name="descofgoods[<?php echo $i?>]" value='<?php echo $descofgoods;?>' /></td> <td><input type="text" name="qty[<?php echo $i?>]" maxlength="50000000" id="qty_<?PHP echo($i) ?>"/></td> <td><input type="text" name="Rate[<?php echo $i?>]" value='<?php echo $Rate;?>' id="rate_<?PHP echo($i) ?>" onclick="ram('<?PHP echo($i) ?>')";></td> <td><input type="text" name="Amount[<?php echo $i?>]" id="amt_<?PHP echo($i) ?>"/></td> </tr> <?php $i++;} ?> <tr> <td><input type="submit" value="submit" header("location:values to be brought for print page.php");/></td> </tr> </table> <label></label> </form> </body> </html> <?php /*error_reporting(E_ALL ^ E_NOTICE); $con = mysql_connect("localhost","root",""); if (!$con) { die('Could not connect: ' . mysql_error()); } mysql_select_db("form1", $con); /*if(checked=checkbox) { mysql_query="INSERT INTO invo (Name, Address, City, Pincode, No, Date, DCNo, DcDate, YourOrderNo, OrderDate, VendorCode, SNo, descofgoods, Qty, Rate, Amount) VALUES ('$_POST[Name]','$_POST[Address]','$_POST[City]','$_POST[Pincode]','$_POST[No]','$_POST[Date]','$_POST[DCNo]','$_POST[DcDate]','$_POST[YourOrderNo]','$_POST[OrderDate]','$_POST[VendorCode]','$_POST[SNo]','$_POST[descofgoods]','$_POST[qty]','$_POST[Rate]','$_POST[Amount]')"; } else { header("location:values to be brought for print page.php"); }*/ header("ins.php"); ?>

    Read the article

  • OBIEE 11.1.1 - Disable Wrap Data Types in WebLogic Server 10.3.x

    - by Ahmed Awan
    By default, JDBC data type’s objects are wrapped with a WebLogic wrapper. This allows for features like debugging output and track connection usage to be done by the server. The wrapping can be turned off by setting this value to false. This improves performance, in some cases significantly, and allows for the application to use the native driver objects directly. Tip: How to Disable Wrapping in WLS Administration Console You can use the Administration Console to disable data type wrapping for following JDBC data sources in bifoundation_domain domain: Data Source Name bip_datasource mds-owsm EPMSystemRegistry   To disable wrapping for each JDBC data source (as stated in above table): 1.     If you have not already done so, in the Change Center of the Administration Console, click Lock & Edit. 2.     In the Domain Structure tree, expand Services, then select Data Sources. 3.     On the Summary of Data Sources page, click the data source name for example “mds-owsm”. 4.     Select the Configuration: Connection Pool tab. 5.     Scroll down and click Advanced to show the advanced connection pool options. 6.     In Wrap Data Types, deselect the checkbox to disable wrapping. 7.     Click Save. 8.     To activate these changes, in the Change Center of the Administration Console, click Activate Changes. Important Note: This change does not take effect immediately—it requires the server be restarted.

    Read the article

  • Building Simple Workflows in Oozie

    - by dan.mcclary
    Introduction More often than not, data doesn't come packaged exactly as we'd like it for analysis. Transformation, match-merge operations, and a host of data munging tasks are usually needed before we can extract insights from our Big Data sources. Few people find data munging exciting, but it has to be done. Once we've suffered that boredom, we should take steps to automate the process. We want codify our work into repeatable units and create workflows which we can leverage over and over again without having to write new code. In this article, we'll look at how to use Oozie to create a workflow for the parallel machine learning task I described on Cloudera's site. Hive Actions: Prepping for Pig In my parallel machine learning article, I use data from the National Climatic Data Center to build weather models on a state-by-state basis. NCDC makes the data freely available as gzipped files of day-over-day observations stretching from the 1930s to today. In reading that post, one might get the impression that the data came in a handy, ready-to-model files with convenient delimiters. The truth of it is that I need to perform some parsing and projection on the dataset before it can be modeled. If I get more observations, I'll want to retrain and test those models, which will require more parsing and projection. This is a good opportunity to start building up a workflow with Oozie. I store the data from the NCDC in HDFS and create an external Hive table partitioned by year. This gives me flexibility of Hive's query language when I want it, but let's me put the dataset in a directory of my choosing in case I want to treat the same data with Pig or MapReduce code. CREATE EXTERNAL TABLE IF NOT EXISTS historic_weather(column 1, column2) PARTITIONED BY (yr string) STORED AS ... LOCATION '/user/oracle/weather/historic'; As new weather data comes in from NCDC, I'll need to add partitions to my table. That's an action I should put in the workflow. Similarly, the weather data requires parsing in order to be useful as a set of columns. Because of their long history, the weather data is broken up into fields of specific byte lengths: x bytes for the station ID, y bytes for the dew point, and so on. The delimiting is consistent from year to year, so writing SerDe or a parser for transformation is simple. Once that's done, I want to select columns on which to train, classify certain features, and place the training data in an HDFS directory for my Pig script to access. ALTER TABLE historic_weather ADD IF NOT EXISTS PARTITION (yr='2010') LOCATION '/user/oracle/weather/historic/yr=2011'; INSERT OVERWRITE DIRECTORY '/user/oracle/weather/cleaned_history' SELECT w.stn, w.wban, w.weather_year, w.weather_month, w.weather_day, w.temp, w.dewp, w.weather FROM ( FROM historic_weather SELECT TRANSFORM(...) USING '/path/to/hive/filters/ncdc_parser.py' as stn, wban, weather_year, weather_month, weather_day, temp, dewp, weather ) w; Since I'm going to prepare training directories with at least the same frequency that I add partitions, I should also add that to my workflow. Oozie is going to invoke these Hive actions using what's somewhat obviously referred to as a Hive action. Hive actions amount to Oozie running a script file containing our query language statements, so we can place them in a file called weather_train.hql. Starting Our Workflow Oozie offers two types of jobs: workflows and coordinator jobs. Workflows are straightforward: they define a set of actions to perform as a sequence or directed acyclic graph. Coordinator jobs can take all the same actions of Workflow jobs, but they can be automatically started either periodically or when new data arrives in a specified location. To keep things simple we'll make a workflow job; coordinator jobs simply require another XML file for scheduling. The bare minimum for workflow XML defines a name, a starting point, and an end point: <workflow-app name="WeatherMan" xmlns="uri:oozie:workflow:0.1"> <start to="ParseNCDCData"/> <end name="end"/> </workflow-app> To this we need to add an action, and within that we'll specify the hive parameters Also, keep in mind that actions require <ok> and <error> tags to direct the next action on success or failure. <action name="ParseNCDCData"> <hive xmlns="uri:oozie:hive-action:0.2"> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <configuration> <property> <name>oozie.hive.defaults</name> <value>/user/oracle/weather_ooze/hive-default.xml</value> </property> </configuration> <script>ncdc_parse.hql</script> </hive> <ok to="WeatherMan"/> <error to="end"/> </action> There are a couple of things to note here: I have to give the FQDN (or IP) and port of my JobTracker and NameNode. I have to include a hive-default.xml file. I have to include a script file. The hive-default.xml and script file must be stored in HDFS That last point is particularly important. Oozie doesn't make assumptions about where a given workflow is being run. You might submit workflows against different clusters, or have different hive-defaults.xml on different clusters (e.g. MySQL or Postgres-backed metastores). A quick way to ensure that all the assets end up in the right place in HDFS is just to make a working directory locally, build your workflow.xml in it, and copy the assets you'll need to it as you add actions to workflow.xml. At this point, our local directory should contain: workflow.xml hive-defaults.xml (make sure this file contains your metastore connection data) ncdc_parse.hql Adding Pig to the Ooze Adding our Pig script as an action is slightly simpler from an XML standpoint. All we do is add an action to workflow.xml as follows: <action name="WeatherMan"> <pig> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <script>weather_train.pig</script> </pig> <ok to="end"/> <error to="end"/> </action> Once we've done this, we'll copy weather_train.pig to our working directory. However, there's a bit of a "gotcha" here. My pig script registers the Weka Jar and a chunk of jython. If those aren't also in HDFS, our action will fail from the outset -- but where do we put them? The Jython script goes into the working directory at the same level as the pig script, because pig attempts to load Jython files in the directory from which the script executes. However, that's not where our Weka jar goes. While Oozie doesn't assume much, it does make an assumption about the Pig classpath. Anything under working_directory/lib gets automatically added to the Pig classpath and no longer requires a REGISTER statement in the script. Anything that uses a REGISTER statement cannot be in the working_directory/lib directory. Instead, it needs to be in a different HDFS directory and attached to the pig action with an <archive> tag. Yes, that's as confusing as you think it is. You can get the exact rules for adding Jars to the distributed cache from Oozie's Pig Cookbook. Making the Workflow Work We've got a workflow defined and have collected all the components we'll need to run. But we can't run anything yet, because we still have to define some properties about the job and submit it to Oozie. We need to start with the job properties, as this is essentially the "request" we'll submit to the Oozie server. In the same working directory, we'll make a file called job.properties as follows: nameNode=hdfs://localhost:8020 jobTracker=localhost:8021 queueName=default weatherRoot=weather_ooze mapreduce.jobtracker.kerberos.principal=foo dfs.namenode.kerberos.principal=foo oozie.libpath=${nameNode}/user/oozie/share/lib oozie.wf.application.path=${nameNode}/user/${user.name}/${weatherRoot} outputDir=weather-ooze While some of the pieces of the properties file are familiar (e.g., JobTracker address), others take a bit of explaining. The first is weatherRoot: this is essentially an environment variable for the script (as are jobTracker and queueName). We're simply using them to simplify the directives for the Oozie job. The oozie.libpath pieces is extremely important. This is a directory in HDFS which holds Oozie's shared libraries: a collection of Jars necessary for invoking Hive, Pig, and other actions. It's a good idea to make sure this has been installed and copied up to HDFS. The last two lines are straightforward: run the application defined by workflow.xml at the application path listed and write the output to the output directory. We're finally ready to submit our job! After all that work we only need to do a few more things: Validate our workflow.xml Copy our working directory to HDFS Submit our job to the Oozie server Run our workflow Let's do them in order. First validate the workflow: oozie validate workflow.xml Next, copy the working directory up to HDFS: hadoop fs -put working_dir /user/oracle/working_dir Now we submit the job to the Oozie server. We need to ensure that we've got the correct URL for the Oozie server, and we need to specify our job.properties file as an argument. oozie job -oozie http://url.to.oozie.server:port_number/ -config /path/to/working_dir/job.properties -submit We've submitted the job, but we don't see any activity on the JobTracker? All I got was this funny bit of output: 14-20120525161321-oozie-oracle This is because submitting a job to Oozie creates an entry for the job and places it in PREP status. What we got back, in essence, is a ticket for our workflow to ride the Oozie train. We're responsible for redeeming our ticket and running the job. oozie -oozie http://url.to.oozie.server:port_number/ -start 14-20120525161321-oozie-oracle Of course, if we really want to run the job from the outset, we can change the "-submit" argument above to "-run." This will prep and run the workflow immediately. Takeaway So, there you have it: the somewhat laborious process of building an Oozie workflow. It's a bit tedious the first time out, but it does present a pair of real benefits to those of us who spend a great deal of time data munging. First, when new data arrives that requires the same processing, we already have the workflow defined and ready to run. Second, as we build up a set of useful action definitions over time, creating new workflows becomes quicker and quicker.

    Read the article

  • SQLAuthority News – Scaling Up Your Data Warehouse with SQL Server 2008 R2

    - by pinaldave
    Data Warehouses are suppose to be containing huge amount of the data from the beginning. However, there are cases when too big is not enough. Every Data Warehouse Admin will agree that they have faced situation where they will need to scale up their data warehouse. Microsoft has released white paper discussing the same. Here is the abstract from the Microsoft Official site: SQL Server 2008 introduced many new functional and performance improvements for data warehousing, and SQL Server 2008 R2 includes all these and more. This paper discusses how to use SQL Server 2008 R2 to get great performance as your data warehouse scales up. We present lessons learned during extensive internal data warehouse testing on a 64-core HP Integrity Superdome during the development of the SQL Server 2008 release, and via production experience with large-scale SQL Server customers. Our testing indicates that many customers can expect their performance to nearly double on the same hardware they are currently using, merely by upgrading to SQL Server 2008 R2 from SQL Server 2005 or earlier, and compressing their fact tables. We cover techniques to improve manageability and performance at high-scale, encompassing data loading (extract, transform, load), query processing, partitioning, index maintenance, indexed view (aggregate) management, and backup and restore. Scaling Up Your Data Warehouse with SQL Server 2008 R2 Reference: Pinal Dave (http://blog.SQLAuthority.com)   Filed under: PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • First Look - Oracle Data Mining

    - by kimberly.billings
    In his blog, JT on EDM, James Taylor shares his analysis of Oracle Data Mining, including its new GUI and Exadata integration. While Oracle Data Mining has been available for a while, it is now easier to access and try via the Amazon Cloud. Using the Oracle 11gR2 Data Mining Amazon Machine Image (AMI), you can launch an Oracle Data Mining-enabled instance directly through Amazon Web Services (AWS) and connect to it using the Oracle Data Miner graphical user interface. The new Oracle Data Mining GUI, which will be available to beta customers soon, provides more graphics, the ability to define, save and share analytical "work flows" to solve business problems, and provides more automation and simplicity. Taylor comments that, "the UI looks to have a nice look and feel including graphical model development flows, easy access to the data, nice little micro graphs when browsing data records and more." On using Oracle Data Mining with Exadata, Taylor writes, "Oracle says that the use of the ODM routines in the Exadata kernel is faster than running a native ODM model in the database by a factor of 2 and that this increases as more joins are used. This could mean that ODM outperforms even third party in-database analytics." Taylor concludes his blog with a positive overall review, stating that "ODM is a nice product for Oracle database customers and well worth looking into. The new UI will only make it more so." Read the blog. var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-13185312-1"); pageTracker._trackPageview(); } catch(err) {}

    Read the article

  • MMO Data Persistence Question

    - by JasonG
    I wanted to ask a question regarding data persistence strategies for an MMO. I have some experience in the games industry with social synchronous games. At Zynga, we stored static proto data in XML on both the client and the server and stored instance/runtime data in membase. For clarity sake, proto data for a Potion would be PotionName or MaxCharges, while runtime/instance data would be something like ChargesRemaining. So basically, if a player picks up a potion the instance is (via prediction) created from XML data on the client, the request gets sent to the server where the instance is created from XML and then added to membase. Is the same strategy that would be used for soemthing like an MMO? Would it be feasible to have static proto data in some kind of in-memory no-sql database on both client and server with instance data being stored on the server in a more enterprise level database? Or should all data (proto/instance) be stored on the server and the client gets everything from server? I know a lot of this might on certain game requirements, however, i'm basically looking for some general opinion/best practices here if there are any.

    Read the article

  • Podcast Show Notes: The Big Deal About Big Data

    - by Bob Rhubart
    This week the OTN ArchBeat kicks off a three-part series that looks at Big Data: what it is, its affect on enterprise IT, and what architects need to do to stay ahead of the big data curve. My guests for this conversation are Jean-Pierre Dijks and Andrew Bond . Jean-Pierre, based at Oracle HQ in Redwood Shores, CA, is product manager for Oracle Big Data Appliance and Oracle's big data strategy. Andrew Bond  is Head of Transformation Architecture for Oracle, where he specialzes in Data Warehousing, Business Intelligence, and Big Data. Andrew is based in the UK, but for this conversation he dialed in from a car somewhere on the streets of Amsterdam. Listen to Part 1What is Big Data, really, and why does it matter? Listen to Part 2 (Oct 10)What new challenges does Big Data present for Architects? What do architects need to do to prepare themselves and their environments? Listen to Part 3 (Oct 17)Who is driving the adoption of Big Data strategies in organizations, and why? Additional Resources http://blogs.oracle.com/datawarehousing http://www.facebook.com/pages/OracleBigData https://twitter.com/#!/OracleBigData Coming Soon A conversation about how the rapidly evolving enterprise IT landscape is transforming the roles, responsibilities, and skill requirements for architects and developers. Stay tuned: RSS

    Read the article

  • Database Replication check script not running

    - by Tarun
    I'm trying to create a Database Replication checking script but I'm getting error while executing it. Here is the script #!/bin/bash PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin export PATH #Server Name Server="Test Server" #My Sql Username and Password User=root Password="a" #Maximum Slave Time Delay Delay="60" #File Path to store error and email the same Log_File=/tmp/replicationcheck.txt #Email Settings Subject="$Server Replication Error" Sender_Name=TestServer Recipients="[email protected]" #Mail Alert Function mailalert(){ sendmail -F $Sender_Name -it <<END_MESSAGE To: $Recipients Subject: $Subject $Message_Replication_Error `/bin/cat $Log_File` END_MESSAGE } #Show Slave Status Show_Slave_Status=`echo "show slave status \G;" | mysql -u $User -p$Password 2>&1` #Getting list of queries in mysql $Show_Slave_Status | grep "Last_" > $Log_File #Check if slave running $Show_Slave_Status | grep "Slave_IO_Running: No" if [ "$?" -eq "0" ]; then Message_Replication_Error="$Server Replication error please check. The Slave_IO_Running state is No." mailalert exit 1 else $Show_Slave_Status | grep "Slave_IO_Running: Connecting" if [ "$?" -eq "0" ]; then Message_Replication_Error="$Server Replication error please check. The Slave_IO_Running state is Connecting." mailalert exit 1 fi fi #Check if replication delayed Seconds_Behind_Master=$Show_Slave_Status | grep "Seconds_Behind_Master" | awk -F": " {' print $2 '} if [ "$Seconds_Behind_Master" -ge "$Delay" ]; then Message_Replication_Error="Replication Delayed by $Seconds_Behind_Master." mailalert else if [ "$Seconds_Behind_Master" = "NULL" ]; then Message_Replication_Error="$Server Replication error please check. The Seconds_Behind_Master state is NULL." mailalert fi fi

    Read the article

  • Developer Training – 6 Online Courses to Learn SQL Server, MySQL and Technology

    - by Pinal Dave
    Video courses are the next big thing and I am so happy that I have so far authored 6 different video courses with Pluralsight. Here is the list of the courses. I have listed all of my video courses over here. Note: If you click on the courses and it does not open, you need to login to Pluralsight with a valid username and password or sign up for a FREE trial. Please leave a comment with your favorite course in the comment section. Random 10 winners will get surprise gift via email. Bonus: If you list your favorite module from the course site. SQL Server Performance: Introduction to Query Tuning SQL Server performance tuning is an in-depth topic, and an art to master. A key component of overall application performance tuning is query tuning. Writing queries in an efficient manner, and making sure they execute in the most optimal way possible, is always a challenge. The basics revolve around the details of how SQL Server carries out query execution, so the optimizations explored in this course follow along the same lines. Click to View Course SQL Server Performance: Indexing Basics Indexes are the most crucial objects of the database. They are the first stop for any DBA and Developer when it is about performance tuning. There is a good side as well evil side of the indexes. To master the art of performance tuning one has to understand the fundamentals of the indexes and the best practices associated with the same. This course is for every DBA and Developer who deals with performance tuning and wants to use indexes to improve the performance of the server. Click to View Course SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. This course is for anyone working with SQL Server databases who wants to improve her knowledge and understanding of this complex platform. Click to View Course MySQL Fundamentals MySQL is a popular choice of database for use in web applications, and is a central component of the widely used LAMP open source web application software stack. This course covers the fundamentals of MySQL, including how to install MySQL as well as written basic data retrieval and data modification queries. Click to View Course Building a Successful Blog Expressing yourself is the most common behavior of humans. Blogging has made easy to express yourself. Just like a letter or book has a structure and formula, blogging also has structure and formula. In this introductory course on blogging we will go over a few of the basics of blogging and show the way to get started with blogging immediately. If you already have a blog, this course will be even more relevant as this will discuss many of the common questions and issue you face in your blogging routine. Click to View Course Introduction to ColdFusion ColdFusion is rapid web application development platform. In this course you will learn the basics of how to use ColdFusion platform and rapidly develop web sites. The course begins with learning basics of ColdFusion Markup Language and moves to common development language practices. From there we move to frequent database operations and advanced concepts of Forms, Sessions and Cookies. The last module sums up all the concepts covered in the course with sample application. Click to View Course Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

    Read the article

  • Sort algorithms that work on large amount of data

    - by Giorgio
    I am looking for sorting algorithms that can work on a large amount of data, i.e. that can work even when the whole data set cannot be held in main memory at once. The only candidate that I have found up to now is merge sort: you can implement the algorithm in such a way that it scans your data set at each merge without holding all the data in main memory at once. The variation of merge sort I have in mind is described in this article in section Use with tape drives. I think this is a good solution (with complexity O(n x log(n)) but I am curious to know if there are other (possibly faster) sorting algorithms that can work on large data sets that do not fit in main memory. EDIT Here are some more details, as required by the answers: The data needs to be sorted periodically, e.g. once in a month. I do not need to insert a few records and have the data sorted incrementally. My example text file is about 1 GB UTF-8 text, but I wanted to solve the problem in general, even if the file were, say, 20 GB. It is not in a database and, due to other constraints, it cannot be. The data is dumped by others as a text file, I have my own code to read this text file. The format of the data is a text file: new line characters are record separators. One possible improvement I had in mind was to split the file into files that are small enough to be sorted in memory, and finally merge all these files using the algorithm I have described above.

    Read the article

  • Java Desktop Application For Network users

    - by Motasem Abu Aker
    I'm developing a desktop application using Java. My application will run in a network environment where multiple users will access the same database through the application. There will be basic CRUD opreations (Insert, Update, Delete, & select), which means there will be chances of deadlock, or two users trying to update the same record at same time. I'm using the following Java Swing for Clients (MVC). MySQL Server for database (InnODB). Java Web start. Now, MySQL is centralized on the network, and all of the clients connect to it. The Application for ERP Purpose. I searched the internet to find a very good solution to ensure data integrity & to make sure that when updating one record from one client, other clients are aware of it. I read about Socket-server-client & RESTful web services. I don't want to go web application & don't want to use any extra libraries. So how can I handle this scenario: If User A updates a record: Is there a way to update User B's screen with the new value? If user A starts updating a record, how can I prevent other users from attempting to update the same record?

    Read the article

  • What is the best approach for database design with lots of columns?

    - by Pratyush
    I am writing a query based financial application. It lets the user to write complicated equations (much like WHERE part of an SQL query) and find companies matching those criteria. For the above, I currently have more than 500 columns in the database table (each column representing a financial field). Example of Columns are: company_name, sales_annual_00, sales_annual_01, sales_annual_02, sales_annual_03, sales_annual_04, protit_annual_00, profit_annual1...(over 500 such columns). The number of rows is around 5000. Going forward, I would like to further increase the number of columns/financial-fields. For the above I would like to get help regarding: 1) What is the best database design approach? Is it ok to have these many number of columns? 2) How can it be normalized? (User can use any of these fields in search criteria). 3) Is it ok to stick with MySQL, or modern document based databases like MongoDB should be better for it? P.S. (Update): I have been using MySQL till now and a running example of the usage is at: http://screener.in/companies/89/Formula-- In above there around 500 fields/columns to create your query on, however, I seek to increase that number to much more in future.

    Read the article

  • When is it better to offload work to the RDBMS rather than to do it in code?

    - by GeminiDomino
    Okay, I'll cop to it: I'm a better coder than I am at databases, and I'm wondering where thoughts on "best practices" lie on the subject of doing "simple" calculations in the SQL query vs. in the code, such as this MySQL example (I didn't write it, I just have to maintain it!) -- This returns the username, and the users age as of the last event. SELECT u.username as user, IF ((DAY(max(e.date)) - DAY(u.DOB)) &lt; 0 , TRUNCATE(((((YEAR(max(e.date))*12)+MONTH(max(e.date))) -((YEAR(u.DOB)*12)+MONTH(u.DOB)))-1)/12, 0), TRUNCATE((((YEAR(max(e.date))*12)+MONTH(max(e.date))) - ((YEAR(u.DOB)*12)+MONTH(u.DOB)))/12, 0)) AS age FROM users as u JOIN events as e ON u.id = e.uid ... Compared to doing the "heavy" lifting in code: Query: SELECT u.username, u.DOB as dob, e.event_date as edate FROM users as u JOIN events as e ON u.id = e.uid code: function ageAsOfDate($birth, $aod) { //expects dates in mysql Y-m-d format... list($by,$bm,$bd) = explode('-',$birth); list($ay,$am,$ad) = explode('-',$aod); //Insert Calculations here ... return $Dy; //Difference in years } echo "Hey! ". $row['user'] ." was ". ageAsOfDate($row['dob'], $row['edate']) . " when we last saw him."; I'm pretty sure in a simple case like this it wouldn't make much difference (other than the creeping feeling of horror when I have to make changes to queries like the first one), but I think it makes it clearer what I'm looking for. Thanks!

    Read the article

  • Announcing Oracle Receivables Generic Data Fix (GDF) for Refunds

    - by user793553
    Here's the first of what will be a series of Generic Data Fixes (GDF) to be released by Receivables Development. Generic Data Fix (GDF) are created by development to fix data issues caused by bugs/issues in the application code.  Other Generic Data Fix benefits/features include: Developed for bugs that can cause data issues. Provides a SELECT script that uses an identification/signature query to identify and report all data affected by issue/condition caused by a bug. Allow customers to view and modify what will be fixed. Provides a separate FIX script to fix the data reported by the SELECT script. The FIX script creates backup tables for the data that is fixed/updated. Available on My Oracle Support for download In Release 12, when creating a refund by either of the following methods: Applied a receipt to the Refund activity - which creates an Invoice in Payables Or you went directly into Payables to create a refund for an open Credit Memo in Receivables The Invoice in Payables that is associated to the refund is cancelled, the corresponding refund application or credit memo in Receivables is not properly re-instated. For the receipt application, it still remains applied to the Refund whereas this should be automatically unapplied. For the credit memo, it stays closed instead of getting re-opened. Doc ID 761993.1 includes the patch to make sure this doesn’t happen in the future as well as a GDF script to fix the current data (Script name: ar_std_refund_unapp.sql).  Download the script and run in READ_ONLY_MODE to identify 'refund' applications with this problem. Stay tuned for more GDF scripts coming soon...

    Read the article

  • For a large website developed in PHP, is it necessary to have a framework?

    - by Martin
    I am wondering if it is necessary to have a framework or if it is a must-have if I plan to make a large website. Large website could mean a lot of things: in other words, multiple dynamic web pages (40-50 dynamic pages, mysql content) and a lot of visitors (+- a million hits per month). The site will be hosted in a dedicated server environment. I know that it could simplify coding for a developer team, that it includes libraries and a lot of advantages. But I just feel that I don't need that. I think that learning how it works, managing it and installing it would take more time and I could use that time to code. I write PHP the simplest way I could (with performance in mind) and I try to reuse my code/functions/classes most of the time and I make sure that if another developer joins the team, that he won't be lost in the code. I am also planning to use MemCached or another Cache for PHP. As I said, the site will be hosted in a dedicated server environment but will be entirely managed by the hosting company. I am pretty sure the control panel for me to control the basic stuff will be Cpanel. For a developer like me that only knows PHP, Javascript, HTML, CSS, MYSQL and really basic server management, I feel that it seems to complicated to have a framework. Am I wrong? Is it worth the time to learn all about it? Thank you for your opinions and suggestions.

    Read the article

< Previous Page | 272 273 274 275 276 277 278 279 280 281 282 283  | Next Page >