Search Results

Search found 11391 results on 456 pages for 'alter column'.

Page 165/456 | < Previous Page | 161 162 163 164 165 166 167 168 169 170 171 172  | Next Page >

  • How to Load Oracle Tables From Hadoop Tutorial (Part 5 - Leveraging Parallelism in OSCH)

    - by Bob Hanckel
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Using OSCH: Beyond Hello World In the previous post we discussed a “Hello World” example for OSCH focusing on the mechanics of getting a toy end-to-end example working. In this post we are going to talk about how to make it work for big data loads. We will explain how to optimize an OSCH external table for load, paying particular attention to Oracle’s DOP (degree of parallelism), the number of external table location files we use, and the number of HDFS files that make up the payload. We will provide some rules that serve as best practices when using OSCH. The assumption is that you have read the previous post and have some end to end OSCH external tables working and now you want to ramp up the size of the loads. Using OSCH External Tables for Access and Loading OSCH external tables are no different from any other Oracle external tables.  They can be used to access HDFS content using Oracle SQL: SELECT * FROM my_hdfs_external_table; or use the same SQL access to load a table in Oracle. INSERT INTO my_oracle_table SELECT * FROM my_hdfs_external_table; To speed up the load time, you will want to control the degree of parallelism (i.e. DOP) and add two SQL hints. ALTER SESSION FORCE PARALLEL DML PARALLEL  8; ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8; INSERT /*+ append pq_distribute(my_oracle_table, none) */ INTO my_oracle_table SELECT * FROM my_hdfs_external_table; There are various ways of either hinting at what level of DOP you want to use.  The ALTER SESSION statements above force the issue assuming you (the user of the session) are allowed to assert the DOP (more on that in the next section).  Alternatively you could embed additional parallel hints directly into the INSERT and SELECT clause respectively. /*+ parallel(my_oracle_table,8) *//*+ parallel(my_hdfs_external_table,8) */ Note that the "append" hint lets you load a target table by reserving space above a given "high watermark" in storage and uses Direct Path load.  In other doesn't try to fill blocks that are already allocated and partially filled. It uses unallocated blocks.  It is an optimized way of loading a table without incurring the typical resource overhead associated with run-of-the-mill inserts.  The "pq_distribute" hint in this context unifies the INSERT and SELECT operators to make data flow during a load more efficient. Finally your target Oracle table should be defined with "NOLOGGING" and "PARALLEL" attributes.   The combination of the "NOLOGGING" and use of the "append" hint disables REDO logging, and its overhead.  The "PARALLEL" clause tells Oracle to try to use parallel execution when operating on the target table. Determine Your DOP It might feel natural to build your datasets in Hadoop, then afterwards figure out how to tune the OSCH external table definition, but you should start backwards. You should focus on Oracle database, specifically the DOP you want to use when loading (or accessing) HDFS content using external tables. The DOP in Oracle controls how many PQ slaves are launched in parallel when executing an external table. Typically the DOP is something you want to Oracle to control transparently, but for loading content from Hadoop with OSCH, it's something that you will want to control. Oracle computes the maximum DOP that can be used by an Oracle user. The maximum value that can be assigned is an integer value typically equal to the number of CPUs on your Oracle instances, times the number of cores per CPU, times the number of Oracle instances. For example, suppose you have a RAC environment with 2 Oracle instances. And suppose that each system has 2 CPUs with 32 cores. The maximum DOP would be 128 (i.e. 2*2*32). In point of fact if you are running on a production system, the maximum DOP you are allowed to use will be restricted by the Oracle DBA. This is because using a system maximum DOP can subsume all system resources on Oracle and starve anything else that is executing. Obviously on a production system where resources need to be shared 24x7, this can’t be allowed to happen. The use cases for being able to run OSCH with a maximum DOP are when you have exclusive access to all the resources on an Oracle system. This can be in situations when your are first seeding tables in a new Oracle database, or there is a time where normal activity in the production database can be safely taken off-line for a few hours to free up resources for a big incremental load. Using OSCH on high end machines (specifically Oracle Exadata and Oracle BDA cabled with Infiniband), this mode of operation can load up to 15TB per hour. The bottom line is that you should first figure out what DOP you will be allowed to run with by talking to the DBAs who manage the production system. You then use that number to derive the number of location files, and (optionally) the number of HDFS data files that you want to generate, assuming that is flexible. Rule 1: Find out the maximum DOP you will be allowed to use with OSCH on the target Oracle system Determining the Number of Location Files Let’s assume that the DBA told you that your maximum DOP was 8. You want the number of location files in your external table to be big enough to utilize all 8 PQ slaves, and you want them to represent equally balanced workloads. Remember location files in OSCH are metadata lists of HDFS files and are created using OSCH’s External Table tool. They also represent the workload size given to an individual Oracle PQ slave (i.e. a PQ slave is given one location file to process at a time, and only it will process the contents of the location file.) Rule 2: The size of the workload of a single location file (and the PQ slave that processes it) is the sum of the content size of the HDFS files it lists For example, if a location file lists 5 HDFS files which are each 100GB in size, the workload size for that location file is 500GB. The number of location files that you generate is something you control by providing a number as input to OSCH’s External Table tool. Rule 3: The number of location files chosen should be a small multiple of the DOP Each location file represents one workload for one PQ slave. So the goal is to keep all slaves busy and try to give them equivalent workloads. Obviously if you run with a DOP of 8 but have 5 location files, only five PQ slaves will have something to do and the other three will have nothing to do and will quietly exit. If you run with 9 location files, then the PQ slaves will pick up the first 8 location files, and assuming they have equal work loads, will finish up about the same time. But the first PQ slave to finish its job will then be rescheduled to process the ninth location file, potentially doubling the end to end processing time. So for this DOP using 8, 16, or 32 location files would be a good idea. Determining the Number of HDFS Files Let’s start with the next rule and then explain it: Rule 4: The number of HDFS files should try to be a multiple of the number of location files and try to be relatively the same size In our running example, the DOP is 8. This means that the number of location files should be a small multiple of 8. Remember that each location file represents a list of unique HDFS files to load, and that the sum of the files listed in each location file is a workload for one Oracle PQ slave. The OSCH External Table tool will look in an HDFS directory for a set of HDFS files to load.  It will generate N number of location files (where N is the value you gave to the tool). It will then try to divvy up the HDFS files and do its best to make sure the workload across location files is as balanced as possible. (The tool uses a greedy algorithm that grabs the biggest HDFS file and delegates it to a particular location file. It then looks for the next biggest file and puts in some other location file, and so on). The tools ability to balance is reduced if HDFS file sizes are grossly out of balance or are too few. For example suppose my DOP is 8 and the number of location files is 8. Suppose I have only 8 HDFS files, where one file is 900GB and the others are 100GB. When the tool tries to balance the load it will be forced to put the singleton 900GB into one location file, and put each of the 100GB files in the 7 remaining location files. The load balance skew is 9 to 1. One PQ slave will be working overtime, while the slacker PQ slaves are off enjoying happy hour. If however the total payload (1600 GB) were broken up into smaller HDFS files, the OSCH External Table tool would have an easier time generating a list where each workload for each location file is relatively the same.  Applying Rule 4 above to our DOP of 8, we could divide the workload into160 files that were approximately 10 GB in size.  For this scenario the OSCH External Table tool would populate each location file with 20 HDFS file references, and all location files would have similar workloads (approximately 200GB per location file.) As a rule, when the OSCH External Table tool has to deal with more and smaller files it will be able to create more balanced loads. How small should HDFS files get? Not so small that the HDFS open and close file overhead starts having a substantial impact. For our performance test system (Exadata/BDA with Infiniband), I compared three OSCH loads of 1 TiB. One load had 128 HDFS files living in 64 location files where each HDFS file was about 8GB. I then did the same load with 12800 files where each HDFS file was about 80MB size. The end to end load time was virtually the same. However when I got ridiculously small (i.e. 128000 files at about 8MB per file), it started to make an impact and slow down the load time. What happens if you break rules 3 or 4 above? Nothing draconian, everything will still function. You just won’t be taking full advantage of the generous DOP that was allocated to you by your friendly DBA. The key point of the rules articulated above is this: if you know that HDFS content is ultimately going to be loaded into Oracle using OSCH, it makes sense to chop them up into the right number of files roughly the same size, derived from the DOP that you expect to use for loading. Next Steps So far we have talked about OLH and OSCH as alternative models for loading. That’s not quite the whole story. They can be used together in a way that provides for more efficient OSCH loads and allows one to be more flexible about scheduling on a Hadoop cluster and an Oracle Database to perform load operations. The next lesson will talk about Oracle Data Pump files generated by OLH, and loaded using OSCH. It will also outline the pros and cons of using various load methods.  This will be followed up with a final tutorial lesson focusing on how to optimize OLH and OSCH for use on Oracle's engineered systems: specifically Exadata and the BDA. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

    Read the article

  • Can't Repair Mysql Table

    - by Pedro
    Hi, I have one table that I simply can't repair, I already try to remove the partitioning but still get this error: alter table promo_tool_view_44 REMOVE PARTITIONING; ERROR 1034 (HY000): Incorrect key file for table 'promo_tool_view_44'; try to repair it I already try to repair the table but I get this reply: repair table promo_tool_view_1; +-----------------------------+--------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------------------+--------+----------+-----------------------------+ | vad_stats.promo_tool_view_1 | repair | error | Partition p1 returned error | | vad_stats.promo_tool_view_1 | repair | error | Corrupt | +-----------------------------+--------+----------+-----------------------------+ 2 rows in set (0.21 sec) How can I solve this? Thanks, Pedro

    Read the article

  • Finder.app preview pane and QuickLook stretch some of my photos

    - by mcandre
    The Finder column view preview pane and QuickLook stretch many of my photos. But when I open the same photos in Preview.app, they look normal. Screenshot: For example, download this image (reaver.jpg), and view it with Finder's column view. Now view it with QuickLook. It renders correctly in every other application, so there's something going wrong in how QuickLook/Finder get the image dimensions. This problem started happening in either Mac OS X 10.8.1 or 10.8.2. Specs: Finder 10.8 QuickLook v4.0 (555.0) Mac OS X 10.8.2 MacBook Pro 2009 Also posted in Apple Discussions.

    Read the article

  • Binary Log Format in MySQL

    - by amritansu
    Reference manual for MySQL 5.6 states that " Some changes, however, still use the statement-based format. Examples include all DDL (data definition language) statements such as CREATE TABLE, ALTER TABLE, or DROP TABLE. " Does this statement means that even if we have ROW format for binary logs all DDLs will be logged in binary log as statement based? How does this affect replication? Kindly help me to understand this.

    Read the article

  • Vmware Player Network adaptors have no network or internet access in windows 7 enterprise

    - by daffers
    As per the title. My VMWare player installation has setup the two network adaptor VMnet1 and VMnet8 and they are picked up as unidentified networks with no network access (i need this to activate my windows server installation on it). The option to change the network location is not available (this might be because of network policy on the domain despite having set this as configurable in the local security policy section). Is there anyway i can change how these networks are detected or alter the configuration of vmware to get around this?

    Read the article

  • VLOOKUP and match functions appear to be searching the function rather than value

    - by Brandon S.
    Vlookup and match seem to be searching based on the function I have in my cell rather than the value i have in the cell. I have a column with dates, (ex: C2, which has the formula =E2&"/"&F2&"/"&D2 in them, for example). (where E2, F2, D2 are the year, month, and date). In another sheet and column, I have a bunch of dates, and i'm using the formula =VLOOKUP(C2,'sheet2'!A1:B252,2,FALSE), which doesn't work. (returns #N/A) If I replace C2 with the same date, but without the formula (just typing it in), VLOOKUP works. Why is this?

    Read the article

  • What permissions do I need to run SQL*Loader?

    - by Jason Baker
    What permissions does a database user need to be able to run oracle's sql loader? For instance, since sql loader will disable indexes and triggers, does it need ALTER permissions for those items? This seems like a simple question, but I can't find any documentation on this in the manual.

    Read the article

  • Liferay and Oracle DB

    - by iamedu
    Hi! I'm installing liferay community edition with an Oracle database, I managed to get it running with the user SYSTEM, but I don't like this... I want to create another user in another tablespace, the problem is that it seems that liferay needs to create tables and alter them according during its lifetime. Do you know what permission and roles need to be assigned to the user? Thanks a lot in advance.

    Read the article

  • MS Word 2007 Mail Merge fails on ZIP codes with leading Zeros (eg. 01234)

    - by Pretzel
    I have an Excel Spreadsheet with a ZIP code column. For some dumb reason the original spreadsheet I got had all the zip codes stored as numbers, so a ZIP code like 01234 was stored as 1234. Easy to fix with "Format Column" as "Special = ZIP Code". All values like 1234, show up as 01234. Great! When I import it into Word via Mail Merge (to print address labels), the ZIP codes on all the addresses starting with a leading zero (like 01234) revert to their old form (1234). How do I fix this?

    Read the article

  • Deleted user default database - SQL Server 2008

    - by RadiantHex
    Hi folks, I cannot connect to the database anymore, I'm getting: Cannot open user default database. Login failed. I have deleted the database during a previous session and then tried to recreate it. But the recreate failed. Now I am stuck with this error, what can I do? Edit: I'm using Windows Authentication Any ideas? Fixed: use the command: sqlcmd -E -d master then type: ALTER LOGIN [Your Windows Login] WITH DEFAULT_DATABASE=master GO :)

    Read the article

  • Excel 2010 filter arrow not showing text values

    - by DVP
    I have an odd problem on a tracker spreadsheet I use. All the columns have a filter, but when you click on the filter arrow it doesn't show you a breakdown of all the text values for that column. All it shows is the usual 'sort A to Z/Z to A', but the bottom half of the pop-up screen is blank, where normally you have a list of text values that you can further filter by putting a tick next to each. It only displays (Select All) which you can tick, but its pointless as the column has selected all text values and hasn't been further filtered, which is what I need to do.

    Read the article

  • Automatically keeping two excel data tables in-sync (w/out VBA)

    - by Neil
    I'm putting together a workbook for tracking a stock portfolio. The primary sheet contains a table with the list of the transactions. From this I would like to create an overview table on another sheet with only one row per unique stock symbol that includes things like cost basis, returns, etc. The problem is that nothing I've tried updates the overview table correctly when rows are added to the transaction table. The closest I've got is something like the following: http://www.get-digital-help.com/2009/04/14/create-a-unique-alphabetically-sorted-list-extracted-from-a-column/ However, this requires applying that formula to every cell in the primary column of the overview sheet. And even then the range of the table isn't extended down to include new rows as they become valid. Essentially I'm looking for a way that auto-adds rows to a table and copies the formula based on a different table changing without using VBA. Trivial example data Sheet1 Symbol Type Shares Price F Buy 100 12 MSFT Buy 100 25 MSFT Sell 50 28 F Buy 100 16 Sheet2 Symbol Quantity F 200 MSFT 50

    Read the article

  • Excel 2010: if( , , "") not treated the same as blank for pivot table group by date

    - by Confused
    I'm trying to group by date in an Excel 2010 pivot table. The column with dates (i.e., the one want to group by), should be the latest date of 2 other columns if neither is null, or blank. i.e., with a formula like: =IF(AND(A4 <> "", B4 <> ""), MAX(A4,B4), "") Normally, this ""in the IF() formula acts the same as an empty cell. In this case, it is preventing me from grouping by date in the Pivot Table. If I filter the date column by (Blanks), then clear the contents of all those cells, then the pivot table does group by date ok. i.e., "" is not being treated the same as an empty cell.

    Read the article

  • Where is the "fix" button on Windows 7 Photo Viewer?

    - by Edward Tanguay
    When I click on a photo in Vista I get a fix button: When I click on a photo in Windows 7 I don't get a fix button: I used to use this fix button often to fix red-eye and alter the brightness quickly, and the auto-saving was so fast. If I can't get "Windows Photo Viewer" to do this, I'll have to http://www.irfanview.com again to do this.

    Read the article

  • Retrieving a specific value from “df -h” using shell

    - by diegodias
    When I use df -h, I get the following output: Filesystem Size Used Avail Use% Mounted on /dev/mapper/VolGroup00-LogVol00 59G 2.2G 54G 4% / /dev/sda1 122M 38M 78M 33% /boot tmpfs 1.1G 0 1.1G 0% /dev/shm 10.10.0.105:/somepath 11T 8.4T 2.1T 81% /storage4 10.11.0.101:/somepath 15T 8.9T 5.9T 61% /storage1 /dev/mapper/patha 5.0T 255G 4.8T 5% /storage5_vol0 /dev/mapper/pathb 5.0T 195G 4.9T 4% /storage5_vol1 /dev/mapper/pathc 5.0T 608G 4.5T 12% /storage5_vol2 I want to write a script that gets the value of Avail column on a specific storage. I used to use df -k /storage_name | tail -1 | awk '{print $3}' But the FileSystem column can have a value or not .. which would change the variable of my script from $3 to $4. How can I get the Avail on a single command line even if there are no values on the previous columns?

    Read the article

  • Change collation of a MySQL table to utf8_general_cs

    - by jack
    I tried to change collation MySQL table to utf8_general_cs but got following error: mysql> ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE 'utf8_general_cs'; ERROR 1273 (HY000): Unknown collation: 'utf8_general_cs' I run "SHOW COLLATE" command and "utf8_general_cs" is not in the results. What can I do now?

    Read the article

  • Vmware Player network adapters have no network or internet access in Windows 7 enterprise [closed]

    - by daffers
    As per the title. My VMWare player installation has setup the two network adaptor VMnet1 and VMnet8 and they are picked up as unidentified networks with no network access (i need this to activate my windows server installation on it). The option to change the network location is not available (this might be because of network policy on the domain despite having set this as configurable in the local security policy section). Is there anyway i can change how these networks are detected or alter the configuration of vmware to get around this?

    Read the article

  • Conditionally format row based on cell value in Excel 2011 Mac

    - by kojiro
    I'm using Excel Mac 2011. I have read some of the other answers, but this question is different because I want to apply conditional formatting to an entire row when its cell in column B contains the value 'Y'. Simple conditional formatting just formats that one cell. Whenever the field at column B for any given row contains the value 'Y', I'd like to format that row. Using Mac Excel's so-called "classic" conditional formatting, I have this: I would really like to apply that to every row, but it just paints the entire sheet red (because $B$3 contains "Y"). I can't seem to figure out how to get the reference to whatever is in field B for this row in the rule.

    Read the article

  • Will this force a reinitialize in Merge Replication Topology?

    - by Refracted Paladin
    I need to add a couple of columns to a table that is a part of a replication set. It is not a constraint coulumn or a part of any article filters and it allows NULL. I have a pretty good idea that I can run this -- ALTER TABLE tblPlanDomain ADD ReportWageES VARCHAR (100) NULL and NOT force all my clients to reinitialize but I was hoping for some reassurance. Can anyone verify this one way or the other for me? Thanks,

    Read the article

  • conditional formatting for subsequent rows or columns

    - by Trailokya Saikia
    I have data in a range of cells (say six columns and one hundred rows). The first four column contains data and the sixth column has a limiting value. For data in every row the limiting value is different. I have one hundred such rows. I am successfully using Conditional formatting (e.g. cells containing data less than limiting value in first five columns are made red) for 1st row. But how to copy this conditional formatting so that it is applicable for entire hundred rows with respective limiting values. I tried with format painter. But it retains the same source cell (here limiting value) for the purpose of conditional formatting in second and subsequent rows. So, now I am required to use conditional formatting for each row separately s

    Read the article

  • How to remove trailing spaces from SQL Server logical filename?

    - by Luke Girvin
    I'm dealing with a server running SQL Server 2000 SP1, and the logical filenames for one of the databases appear to contain trailing spaces. That is, this query: select replace(name, ' ', 'X') from sysfiles Returns the expected names plus a long string of Xs. How can I deal with this? I've tried running ALTER DATABASE... MODIFY FILE using the name (with and without spaces) and get an error message telling me the file does not exist.

    Read the article

  • How do I keep Conditional Formatting formulas and ranges from automatically changing?

    - by Iszi
    I've found that Conditional Formatting formulas and ranges will automatically adjust when you copy, delete, or move data around in a spreadsheet. While this is a nice idea, it tends to break things for me in some rather weird ways. To avoid this, I tried writing rules that applied to the entire spreadsheet and keyed off of column headers to highlight the data I wanted to check. Example: =AND(A$1="Check This Column For Blanks),ISBLANK(A1)) applied to =$1:$1048576 However, even with the rule explicitly applied to the entire sheet, it was still automatically adjusting (and breaking in weird ways by doing so) as I worked in the sheet. How can I avoid this?

    Read the article

  • How do I rename the Tfs_xxxConfiguration database for TFS 2010?

    - by Jaxidian
    When we installed TFS 2010, we made some poor decisions in naming our databases. I have figured out how to rename everything except for the Tfs_xxxConfiguration database. How can I get this renamed? I can only find a read-only display of the connection string to it - cannot find where to alter that connection string either directly or indirectly. I'd really prefer to not have to rebuild it from scratch... again... Thanks!

    Read the article

  • Importing from CSV and sorting by Date

    - by Andrew Rice
    I have the following script that parses an HR output file looking for employees and outputs information such as Hire Dare, First Name, Last Name, Supervisor etc. The problem I have is that in the current format I think the Hire Date column is being treated as a string so in effect it orders the output by month (i.e. 1/1/01 comes before 2/2/98). Is there a way to map that column to a date/time so it sorts properly? Import-CSV -delimiter "`t" Output.tab | Where-Object {$_.'First Nae' -like '*And*'} | Sort-Object 'Hire Date' | ft 'Hire Date', 'First Name'

    Read the article

< Previous Page | 161 162 163 164 165 166 167 168 169 170 171 172  | Next Page >