Search Results

Search found 3179 results on 128 pages for 'merge replication'.

Page 1/128 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Coherence - How to develop a custom push replication publisher

    - by cosmin.tudor(at)oracle.com
    CoherencePushReplicationDB.zipIn the example bellow I'm describing a way of developing a custom push replication publisher that publishes data to a database via JDBC. This example can be easily changed to publish data to other receivers (JMS,...) by performing changes to step 2 and small changes to step 3, steps that are presented bellow. I've used Eclipse as the development tool. To develop a custom push replication publishers we will need to go through 6 steps: Step 1: Create a custom publisher scheme class Step 2: Create a custom publisher class that should define what the publisher is doing. Step 3: Create a class data is performing the actions (publish to JMS, DB, etc ) for the custom publisher. Step 4: Register the new publisher against a ContentHandler. Step 5: Add the new custom publisher in the cache configuration file. Step 6: Add the custom publisher scheme class to the POF configuration file. All these steps are detailed bellow. The coherence project is attached and conclusions are presented at the end. Step 1: In the Coherence Eclipse project create a class called CustomPublisherScheme that should implement com.oracle.coherence.patterns.pushreplication.publishers.AbstractPublisherScheme. In this class define the elements of the custom-publisher-scheme element. For instance for a CustomPublisherScheme that looks like that: <sync:publisher> <sync:publisher-name>Active2-JDBC-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:custom-publisher-scheme> <sync:jdbc-string>jdbc:oracle:thin:@machine-name:1521:XE</sync:jdbc-string> <sync:username>hr</sync:username> <sync:password>hr</sync:password> </sync:custom-publisher-scheme> </sync:publisher-scheme> </sync:publisher> the code is: package com.oracle.coherence; import java.io.DataInput; import java.io.DataOutput; import java.io.IOException; import com.oracle.coherence.patterns.pushreplication.Publisher; import com.oracle.coherence.configuration.Configurable; import com.oracle.coherence.configuration.Mandatory; import com.oracle.coherence.configuration.Property; import com.oracle.coherence.configuration.parameters.ParameterScope; import com.oracle.coherence.environment.Environment; import com.tangosol.io.pof.PofReader; import com.tangosol.io.pof.PofWriter; import com.tangosol.util.ExternalizableHelper; @Configurable public class CustomPublisherScheme extends com.oracle.coherence.patterns.pushreplication.publishers.AbstractPublisherScheme { /** * */ private static final long serialVersionUID = 1L; private String jdbcString; private String username; private String password; public String getJdbcString() { return this.jdbcString; } @Property("jdbc-string") @Mandatory public void setJdbcString(String jdbcString) { this.jdbcString = jdbcString; } public String getUsername() { return username; } @Property("username") @Mandatory public void setUsername(String username) { this.username = username; } public String getPassword() { return password; } @Property("password") @Mandatory public void setPassword(String password) { this.password = password; } public Publisher realize(Environment environment, ClassLoader classLoader, ParameterScope parameterScope) { return new CustomPublisher(getJdbcString(), getUsername(), getPassword()); } public void readExternal(DataInput in) throws IOException { super.readExternal(in); this.jdbcString = ExternalizableHelper.readSafeUTF(in); this.username = ExternalizableHelper.readSafeUTF(in); this.password = ExternalizableHelper.readSafeUTF(in); } public void writeExternal(DataOutput out) throws IOException { super.writeExternal(out); ExternalizableHelper.writeSafeUTF(out, this.jdbcString); ExternalizableHelper.writeSafeUTF(out, this.username); ExternalizableHelper.writeSafeUTF(out, this.password); } public void readExternal(PofReader reader) throws IOException { super.readExternal(reader); this.jdbcString = reader.readString(100); this.username = reader.readString(101); this.password = reader.readString(102); } public void writeExternal(PofWriter writer) throws IOException { super.writeExternal(writer); writer.writeString(100, this.jdbcString); writer.writeString(101, this.username); writer.writeString(102, this.password); } } Step 2: Define what the CustomPublisher should basically do by creating a new java class called CustomPublisher that implements com.oracle.coherence.patterns.pushreplication.Publisher package com.oracle.coherence; import com.oracle.coherence.patterns.pushreplication.EntryOperation; import com.oracle.coherence.patterns.pushreplication.Publisher; import com.oracle.coherence.patterns.pushreplication.exceptions.PublisherNotReadyException; import java.io.BufferedWriter; import java.util.Iterator; public class CustomPublisher implements Publisher { private String jdbcString; private String username; private String password; private transient BufferedWriter bufferedWriter; public CustomPublisher() { } public CustomPublisher(String jdbcString, String username, String password) { this.jdbcString = jdbcString; this.username = username; this.password = password; this.bufferedWriter = null; } public String getJdbcString() { return this.jdbcString; } public String getUsername() { return username; } public String getPassword() { return password; } public void publishBatch(String cacheName, String publisherName, Iterator<EntryOperation> entryOperations) { DatabasePersistence databasePersistence = new DatabasePersistence( jdbcString, username, password); while (entryOperations.hasNext()) { EntryOperation entryOperation = (EntryOperation) entryOperations .next(); databasePersistence.databasePersist(entryOperation); } } public void start(String cacheName, String publisherName) throws PublisherNotReadyException { System.err .printf("Started: Custom JDBC Publisher for Cache %s with Publisher %s\n", new Object[] { cacheName, publisherName }); } public void stop(String cacheName, String publisherName) { System.err .printf("Stopped: Custom JDBC Publisher for Cache %s with Publisher %s\n", new Object[] { cacheName, publisherName }); } } In the publishBatch method from above we inform the publisher that he is supposed to persist data to a database: DatabasePersistence databasePersistence = new DatabasePersistence( jdbcString, username, password); while (entryOperations.hasNext()) { EntryOperation entryOperation = (EntryOperation) entryOperations .next(); databasePersistence.databasePersist(entryOperation); } Step 3: The class that deals with the persistence is a very basic one that uses JDBC to perform inserts/updates against a database. package com.oracle.coherence; import com.oracle.coherence.patterns.pushreplication.EntryOperation; import java.sql.*; import java.text.SimpleDateFormat; import com.oracle.coherence.Order; public class DatabasePersistence { public static String INSERT_OPERATION = "INSERT"; public static String UPDATE_OPERATION = "UPDATE"; public Connection dbConnection; public DatabasePersistence(String jdbcString, String username, String password) { this.dbConnection = createConnection(jdbcString, username, password); } public Connection createConnection(String jdbcString, String username, String password) { Connection connection = null; System.err.println("Connecting to: " + jdbcString + " Username: " + username + " Password: " + password); try { // Load the JDBC driver String driverName = "oracle.jdbc.driver.OracleDriver"; Class.forName(driverName); // Create a connection to the database connection = DriverManager.getConnection(jdbcString, username, password); System.err.println("Connected to:" + jdbcString + " Username: " + username + " Password: " + password); } catch (ClassNotFoundException e) { e.printStackTrace(); } // driver catch (SQLException e) { e.printStackTrace(); } return connection; } public void databasePersist(EntryOperation entryOperation) { if (entryOperation.getOperation().toString() .equalsIgnoreCase(INSERT_OPERATION)) { insert(((Order) entryOperation.getPublishableEntry().getValue())); } else if (entryOperation.getOperation().toString() .equalsIgnoreCase(UPDATE_OPERATION)) { update(((Order) entryOperation.getPublishableEntry().getValue())); } } public void update(Order order) { String update = "UPDATE Orders set QUANTITY= '" + order.getQuantity() + "', AMOUNT='" + order.getAmount() + "', ORD_DATE= '" + (new SimpleDateFormat("dd-MMM-yyyy")).format(order .getOrdDate()) + "' WHERE SYMBOL='" + order.getSymbol() + "'"; System.err.println("UPDATE = " + update); try { Statement stmt = getDbConnection().createStatement(); stmt.execute(update); stmt.close(); } catch (SQLException ex) { System.err.println("SQLException: " + ex.getMessage()); } } public void insert(Order order) { String insert = "insert into Orders values('" + order.getSymbol() + "'," + order.getQuantity() + "," + order.getAmount() + ",'" + (new SimpleDateFormat("dd-MMM-yyyy")).format(order .getOrdDate()) + "')"; System.err.println("INSERT = " + insert); try { Statement stmt = getDbConnection().createStatement(); stmt.execute(insert); stmt.close(); } catch (SQLException ex) { System.err.println("SQLException: " + ex.getMessage()); } } public Connection getDbConnection() { return dbConnection; } public void setDbConnection(Connection dbConnection) { this.dbConnection = dbConnection; } } Step 4: Now we need to register our publisher against a ContentHandler. In order to achieve that we need to create in our eclipse project a new class called CustomPushReplicationNamespaceContentHandler that should extend the com.oracle.coherence.patterns.pushreplication.configuration.PushReplicationNamespaceContentHandler. In the constructor of the new class we define a new handler for our custom publisher. package com.oracle.coherence; import com.oracle.coherence.configuration.Configurator; import com.oracle.coherence.environment.extensible.ConfigurationContext; import com.oracle.coherence.environment.extensible.ConfigurationException; import com.oracle.coherence.environment.extensible.ElementContentHandler; import com.oracle.coherence.patterns.pushreplication.PublisherScheme; import com.oracle.coherence.environment.extensible.QualifiedName; import com.oracle.coherence.patterns.pushreplication.configuration.PushReplicationNamespaceContentHandler; import com.tangosol.run.xml.XmlElement; public class CustomPushReplicationNamespaceContentHandler extends PushReplicationNamespaceContentHandler { public CustomPushReplicationNamespaceContentHandler() { super(); registerContentHandler("custom-publisher-scheme", new ElementContentHandler() { public Object onElement(ConfigurationContext context, QualifiedName qualifiedName, XmlElement xmlElement) throws ConfigurationException { PublisherScheme publisherScheme = new CustomPublisherScheme(); Configurator.configure(publisherScheme, context, qualifiedName, xmlElement); return publisherScheme; } }); } } Step 5: Now we should define our CustomPublisher in the cache configuration file according to the following documentation. <cache-config xmlns:sync="class:com.oracle.coherence.CustomPushReplicationNamespaceContentHandler" xmlns:cr="class:com.oracle.coherence.environment.extensible.namespaces.InstanceNamespaceContentHandler"> <caching-schemes> <sync:provider pof-enabled="false"> <sync:coherence-provider /> </sync:provider> <caching-scheme-mapping> <cache-mapping> <cache-name>publishing-cache</cache-name> <scheme-name>distributed-scheme-with-publishing-cachestore</scheme-name> <autostart>true</autostart> <sync:publisher> <sync:publisher-name>Active2 Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:remote-cluster-publisher-scheme> <sync:remote-invocation-service-name>remote-site1</sync:remote-invocation-service-name> <sync:remote-publisher-scheme> <sync:local-cache-publisher-scheme> <sync:target-cache-name>publishing-cache</sync:target-cache-name> </sync:local-cache-publisher-scheme> </sync:remote-publisher-scheme> <sync:autostart>true</sync:autostart> </sync:remote-cluster-publisher-scheme> </sync:publisher-scheme> </sync:publisher> <sync:publisher> <sync:publisher-name>Active2-Output-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:stderr-publisher-scheme> <sync:autostart>true</sync:autostart> <sync:publish-original-value>true</sync:publish-original-value> </sync:stderr-publisher-scheme> </sync:publisher-scheme> </sync:publisher> <sync:publisher> <sync:publisher-name>Active2-JDBC-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:custom-publisher-scheme> <sync:jdbc-string>jdbc:oracle:thin:@machine_name:1521:XE</sync:jdbc-string> <sync:username>hr</sync:username> <sync:password>hr</sync:password> </sync:custom-publisher-scheme> </sync:publisher-scheme> </sync:publisher> </cache-mapping> </caching-scheme-mapping> <!-- The following scheme is required for each remote-site when using a RemoteInvocationPublisher --> <remote-invocation-scheme> <service-name>remote-site1</service-name> <initiator-config> <tcp-initiator> <remote-addresses> <socket-address> <address>localhost</address> <port>20001</port> </socket-address> </remote-addresses> <connect-timeout>2s</connect-timeout> </tcp-initiator> <outgoing-message-handler> <request-timeout>5s</request-timeout> </outgoing-message-handler> </initiator-config> </remote-invocation-scheme> <!-- END: com.oracle.coherence.patterns.pushreplication --> <proxy-scheme> <service-name>ExtendTcpProxyService</service-name> <acceptor-config> <tcp-acceptor> <local-address> <address>localhost</address> <port>20002</port> </local-address> </tcp-acceptor> </acceptor-config> <autostart>true</autostart> </proxy-scheme> </caching-schemes> </cache-config> As you can see in the red-marked text from above I've:       - set new Namespace Content Handler       - define the new custom publisher that should work together with other publishers like: stderr and remote publishers in our case. Step 6: Add the com.oracle.coherence.CustomPublisherScheme to your custom-pof-config file: <pof-config> <user-type-list> <!-- Built in types --> <include>coherence-pof-config.xml</include> <include>coherence-common-pof-config.xml</include> <include>coherence-messagingpattern-pof-config.xml</include> <include>coherence-pushreplicationpattern-pof-config.xml</include> <!-- Application types --> <user-type> <type-id>1901</type-id> <class-name>com.oracle.coherence.Order</class-name> <serializer> <class-name>com.oracle.coherence.OrderSerializer</class-name> </serializer> </user-type> <user-type> <type-id>1902</type-id> <class-name>com.oracle.coherence.CustomPublisherScheme</class-name> </user-type> </user-type-list> </pof-config> CONCLUSIONSThis approach allows for publishers to publish data to almost any other receiver (database, JMS, MQ, ...). The only thing that needs to be changed is the DatabasePersistence.java class that should be adapted to the chosen receiver. Only minor changes are needed for the rest of the code (to publishBatch method from CustomPublisher class).

    Read the article

  • Is there any replication standard or concept for application server data replication

    - by Naga
    Hi friends, Say, I have a server that handles file based mass data and can process thousands of read requests and hundreds of provisioning requests(Add, modify, delete) per second. This is not SQL based database. Now i planned to implement replication. There should be master- master replication, master slave replication, partial replication(entries matching a criteria) and fractional replication(part of an entry). Is there any standard for implementing such replication mechanisms? Can you please suggest some solution overview? Thanks, Naga

    Read the article

  • Replication Services in a BI environment

    - by jorg
    In this blog post I will explain the principles of SQL Server Replication Services without too much detail and I will take a look on the BI capabilities that Replication Services could offer in my opinion. SQL Server Replication Services provides tools to copy and distribute database objects from one database system to another and maintain consistency afterwards. These tools basically copy or synchronize data with little or no transformations, they do not offer capabilities to transform data or apply business rules, like ETL tools do. The only “transformations” Replication Services offers is to filter records or columns out of your data set. You can achieve this by selecting the desired columns of a table and/or by using WHERE statements like this: SELECT <published_columns> FROM [Table] WHERE [DateTime] >= getdate() - 60 There are three types of replication: Transactional Replication This type replicates data on a transactional level. The Log Reader Agent reads directly on the transaction log of the source database (Publisher) and clones the transactions to the Distribution Database (Distributor), this database acts as a queue for the destination database (Subscriber). Next, the Distribution Agent moves the cloned transactions that are stored in the Distribution Database to the Subscriber. The Distribution Agent can either run at scheduled intervals or continuously which offers near real-time replication of data! So for example when a user executes an UPDATE statement on one or multiple records in the publisher database, this transaction (not the data itself) is copied to the distribution database and is then also executed on the subscriber. When the Distribution Agent is set to run continuously this process runs all the time and transactions on the publisher are replicated in small batches (near real-time), when it runs on scheduled intervals it executes larger batches of transactions, but the idea is the same. Snapshot Replication This type of replication makes an initial copy of database objects that need to be replicated, this includes the schemas and the data itself. All types of replication must start with a snapshot of the database objects from the Publisher to initialize the Subscriber. Transactional replication need an initial snapshot of the replicated publisher tables/objects to run its cloned transactions on and maintain consistency. The Snapshot Agent copies the schemas of the tables that will be replicated to files that will be stored in the Snapshot Folder which is a normal folder on the file system. When all the schemas are ready, the data itself will be copied from the Publisher to the snapshot folder. The snapshot is generated as a set of bulk copy program (BCP) files. Next, the Distribution Agent moves the snapshot to the Subscriber, if necessary it applies schema changes first and copies the data itself afterwards. The application of schema changes to the Subscriber is a nice feature, when you change the schema of the Publisher with, for example, an ALTER TABLE statement, that change is propagated by default to the Subscriber(s). Merge Replication Merge replication is typically used in server-to-client environments, for example when subscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers, like with mobile devices that need to synchronize one in a while. Because I don’t really see BI capabilities here, I will not explain this type of replication any further. Replication Services in a BI environment Transactional Replication can be very useful in BI environments. In my opinion you never want to see users to run custom (SSRS) reports or PowerPivot solutions directly on your production database, it can slow down the system and can cause deadlocks in the database which can cause errors. Transactional Replication can offer a read-only, near real-time database for reporting purposes with minimal overhead on the source system. Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!

    Read the article

  • re-enabling a table for mysql replication

    - by jessieE
    We were able to setup mysql master-slave replication with the following version on both master/slave: mysqld Ver 5.5.28-29.1-log for Linux on x86_64 (Percona Server (GPL), Release 29.1) One day, we noticed that replication has stopped, we tried skipping over the entries that caused the replication errors. The errors persisted so we decided to skip replication for the 4 problematic tables. The slave has now caught up with the master except for the 4 tables. What is the best way to enable replication again for the 4 tables? This is what I have in mind but I don't know if it will work: 1) Modify slave config to enable replication again for the 4 tables 2) stop slave replication 3) for each of the 4 tables, use pt-table-sync --execute --verbose --print --sync-to-master h=localhost,D=mydb,t=mytable 4) restart slave database to reload replication configuration 5) start slave replication

    Read the article

  • Manage and Monitor Identity Ranges in SQL Server Transactional Replication

    - by Yaniv Etrogi
    Problem When using transactional replication to replicate data in a one way topology from a publisher to a read-only subscriber(s) there is no need to manage identity ranges. However, when using  transactional replication to replicate data in a two way replication topology - between two or more servers there is a need to manage identity ranges in order to prevent a situation where an INSERT commands fails on a PRIMARY KEY violation error  due to the replicated row being inserted having a value for the identity column which already exists at the destination database. Solution There are two ways to address this situation: Assign a range of identity values per each server. Work with parallel identity values. The first method requires some maintenance while the second method does not and so the scripts provided with this article are very useful for anyone using the first method. I will explore this in more detail later in the article. In the first solution set server1 to work in the range of 1 to 1,000,000,000 and server2 to work in the range of 1,000,000,001 to 2,000,000,000.  The ranges are set and defined using the DBCC CHECKIDENT command and when the ranges in this example are well maintained you meet the goal of preventing the INSERT commands to fall due to a PRIMARY KEY violation. The first insert at server1 will get the identity value of 1, the second insert will get the value of 2 and so on while on server2 the first insert will get the identity value of 1000000001, the second insert 1000000002 and so on thus avoiding a conflict. Be aware that when a row is inserted the identity value (seed) is generated as part of the insert command at each server and the inserted row is replicated. The replicated row includes the identity column’s value so the data remains consistent across all servers but you will be able to tell on what server the original insert took place due the range that  the identity value belongs to. In the second solution you do not manage ranges but enforce a situation in which identity values can never get overlapped by setting the first identity value (seed) and the increment property one time only during the CREATE TABLE command of each table. So a table on server1 looks like this: CREATE TABLE T1 (  c1 int NOT NULL IDENTITY(1, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); And a table on server2 looks like this: CREATE TABLE T1(  c1 int NOT NULL IDENTITY(2, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); When these two tables are inserted the results of the identity values look like this: Server1:  1, 6, 11, 16, 21, 26… Server2:  2, 7, 12, 17, 22, 27… This assures no identity values conflicts while leaving a room for 3 additional servers to participate in this same environment. You can go up to 9 servers using this method by setting an increment value of 9 instead of 5 as I used in this example. Continues…

    Read the article

  • mysql replication 1x master, 1x slave

    - by clarkk
    I have just setup one master and one slave server, but its not working.. On my website I connect to the slave server and I insert some rows, but they do not appear on the master and vice versa.. What is wrong? This is what I did: Master: -> /etc/mysql/my.cnf [mysqld] log-bin = mysql-master-bin server-id=1 # bind-address = 127.0.0.1 binlog-do-db = test_db Slave: -> /etc/mysql/my.cnf [mysqld] log-bin = mysql-slave-bin server-id=2 # bind-address = 127.0.0.1 replicate-do-db = test_db Slave: terminal 0 > mysql> STOP SLAVE; // and drop tables Master: terminal 1 > mysql> CREATE USER 'repl_slave'@'slave_ip' IDENTIFIED BY 'repl_pass'; mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl_slave'@'slave_ip'; mysql> FLUSH PRIVILEGES; mysql> FLUSH TABLES WITH READ LOCK; -- leave terminal open terminal 2 > shell> mysqldump -u root -pPASSWORD test_db --lock-all-tables > dump.sql mysql> SHOW MASTER STATUS; Slave: terminal 3 > shell> mysql -u root -pPASSWORD test_db < dump.sql terminal 0 > mysql> CHANGE MASTER TO mysql> MASTER_HOST='master_ip', mysql> MASTER_USER='repl_slave', mysql> MASTER_PASSWORD='repl_pass', mysql> MASTER_PORT=3306, mysql> MASTER_LOG_FILE='mysql-master-bin.000003', // terminal 2 > SHOW MASTER STATUS mysql> MASTER_LOG_POS=4, // terminal 2 > SHOW MASTER STATUS mysql> MASTER_CONNECT_RETRY=10; mysql> START SLAVE; mysql> SHOW SLAVE STATUS; Here is the slave status: Array ( [Slave_IO_State] => Waiting for master to send event [Master_Host] => xx.xx.xx.xx [Master_User] => repl_slave [Master_Port] => 3306 [Connect_Retry] => 10 [Master_Log_File] => mysql-master-bin.000003 [Read_Master_Log_Pos] => 106 [Relay_Log_File] => mysqld-relay-bin.000002 [Relay_Log_Pos] => 258 [Relay_Master_Log_File] => mysql-master-bin.000003 [Slave_IO_Running] => Yes [Slave_SQL_Running] => Yes [Replicate_Do_DB] => test_db [Replicate_Ignore_DB] => [Replicate_Do_Table] => [Replicate_Ignore_Table] => [Replicate_Wild_Do_Table] => [Replicate_Wild_Ignore_Table] => [Last_Errno] => 0 [Last_Error] => [Skip_Counter] => 0 [Exec_Master_Log_Pos] => 106 [Relay_Log_Space] => 414 [Until_Condition] => None [Until_Log_File] => [Until_Log_Pos] => 0 [Master_SSL_Allowed] => No [Master_SSL_CA_File] => [Master_SSL_CA_Path] => [Master_SSL_Cert] => [Master_SSL_Cipher] => [Master_SSL_Key] => [Seconds_Behind_Master] => 0 [Master_SSL_Verify_Server_Cert] => No [Last_IO_Errno] => 0 [Last_IO_Error] => [Last_SQL_Errno] => 0 [Last_SQL_Error] => )

    Read the article

  • Replication Services as ETL extraction tool

    - by jorg
    In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are  asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication   The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication…   The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next   You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button.   In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines!   On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard   The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication.   Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions   The New Subscription Wizard appears   Select the publisher on which you just created your publication and select the database and publication (SnapshotTest)   You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database   Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful.   You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here   Click Next   Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot:   Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot:   Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data:   Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

    Read the article

  • SQL SERVER – Merge Operations – Insert, Update, Delete in Single Execution

    - by pinaldave
    This blog post is written in response to T-SQL Tuesday hosted by Jorge Segarra (aka SQLChicken). I have been very active using these Merge operations in my development. However, I have found out from my consultancy work and friends that these amazing operations are not utilized by them most of the time. Here is my attempt to bring the necessity of using the Merge Operation to surface one more time. MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. One of the most important advantages of MERGE statement is that the entire data are read and processed only once. In earlier versions, three different statements had to be written to process three different activities (INSERT, UPDATE or DELETE); however, by using MERGE statement, all the update activities can be done in one pass of database table. I have written about these Merge Operations earlier in my blog post over here SQL SERVER – 2008 – Introduction to Merge Statement – One Statement for INSERT, UPDATE, DELETE. I was asked by one of the readers that how do we know that this operator was doing everything in single pass and was not calling this Merge Operator multiple times. Let us run the same example which I have used earlier; I am listing the same here again for convenience. --Let’s create Student Details and StudentTotalMarks and inserted some records. USE tempdb GO CREATE TABLE StudentDetails ( StudentID INTEGER PRIMARY KEY, StudentName VARCHAR(15) ) GO INSERT INTO StudentDetails VALUES(1,'SMITH') INSERT INTO StudentDetails VALUES(2,'ALLEN') INSERT INTO StudentDetails VALUES(3,'JONES') INSERT INTO StudentDetails VALUES(4,'MARTIN') INSERT INTO StudentDetails VALUES(5,'JAMES') GO CREATE TABLE StudentTotalMarks ( StudentID INTEGER REFERENCES StudentDetails, StudentMarks INTEGER ) GO INSERT INTO StudentTotalMarks VALUES(1,230) INSERT INTO StudentTotalMarks VALUES(2,255) INSERT INTO StudentTotalMarks VALUES(3,200) GO -- Select from Table SELECT * FROM StudentDetails GO SELECT * FROM StudentTotalMarks GO -- Merge Statement MERGE StudentTotalMarks AS stm USING (SELECT StudentID,StudentName FROM StudentDetails) AS sd ON stm.StudentID = sd.StudentID WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25 WHEN NOT MATCHED THEN INSERT(StudentID,StudentMarks) VALUES(sd.StudentID,25); GO -- Select from Table SELECT * FROM StudentDetails GO SELECT * FROM StudentTotalMarks GO -- Clean up DROP TABLE StudentDetails GO DROP TABLE StudentTotalMarks GO The Merge Join performs very well and the following result is obtained. Let us check the execution plan for the merge operator. You can click on following image to enlarge it. Let us evaluate the execution plan for the Table Merge Operator only. We can clearly see that the Number of Executions property suggests value 1. Which is quite clear that in a single PASS, the Merge Operation completes the operations of Insert, Update and Delete. I strongly suggest you all to use this operation, if possible, in your development. I have seen this operation implemented in many data warehousing applications. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Joins, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Merge

    Read the article

  • MERGE Bug with Filtered Indexes

    - by Paul White
    A MERGE statement can fail, and incorrectly report a unique key violation when: The target table uses a unique filtered index; and No key column of the filtered index is updated; and A column from the filtering condition is updated; and Transient key violations are possible Example Tables Say we have two tables, one that is the target of a MERGE statement, and another that contains updates to be applied to the target.  The target table contains three columns, an integer primary key, a single character alternate key, and a status code column.  A filtered unique index exists on the alternate key, but is only enforced where the status code is ‘a’: CREATE TABLE #Target ( pk integer NOT NULL, ak character(1) NOT NULL, status_code character(1) NOT NULL,   PRIMARY KEY (pk) );   CREATE UNIQUE INDEX uq1 ON #Target (ak) INCLUDE (status_code) WHERE status_code = 'a'; The changes table contains just an integer primary key (to identify the target row to change) and the new status code: CREATE TABLE #Changes ( pk integer NOT NULL, status_code character(1) NOT NULL,   PRIMARY KEY (pk) ); Sample Data The sample data for the example is: INSERT #Target (pk, ak, status_code) VALUES (1, 'A', 'a'), (2, 'B', 'a'), (3, 'C', 'a'), (4, 'A', 'd');   INSERT #Changes (pk, status_code) VALUES (1, 'd'), (4, 'a');          Target                     Changes +-----------------------+    +------------------+ ¦ pk ¦ ak ¦ status_code ¦    ¦ pk ¦ status_code ¦ ¦----+----+-------------¦    ¦----+-------------¦ ¦  1 ¦ A  ¦ a           ¦    ¦  1 ¦ d           ¦ ¦  2 ¦ B  ¦ a           ¦    ¦  4 ¦ a           ¦ ¦  3 ¦ C  ¦ a           ¦    +------------------+ ¦  4 ¦ A  ¦ d           ¦ +-----------------------+ The target table’s alternate key (ak) column is unique, for rows where status_code = ‘a’.  Applying the changes to the target will change row 1 from status ‘a’ to status ‘d’, and row 4 from status ‘d’ to status ‘a’.  The result of applying all the changes will still satisfy the filtered unique index, because the ‘A’ in row 1 will be deleted from the index and the ‘A’ in row 4 will be added. Merge Test One Let’s now execute a MERGE statement to apply the changes: MERGE #Target AS t USING #Changes AS c ON c.pk = t.pk WHEN MATCHED AND c.status_code <> t.status_code THEN UPDATE SET status_code = c.status_code; The MERGE changes the two target rows as expected.  The updated target table now contains: +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦ ¦  1 ¦ A  ¦ d           ¦ <—changed from ‘a’ ¦  2 ¦ B  ¦ a           ¦ ¦  3 ¦ C  ¦ a           ¦ ¦  4 ¦ A  ¦ a           ¦ <—changed from ‘d’ +-----------------------+ Merge Test Two Now let’s repopulate the changes table to reverse the updates we just performed: TRUNCATE TABLE #Changes;   INSERT #Changes (pk, status_code) VALUES (1, 'a'), (4, 'd'); This will change row 1 back to status ‘a’ and row 4 back to status ‘d’.  As a reminder, the current state of the tables is:          Target                        Changes +-----------------------+    +------------------+ ¦ pk ¦ ak ¦ status_code ¦    ¦ pk ¦ status_code ¦ ¦----+----+-------------¦    ¦----+-------------¦ ¦  1 ¦ A  ¦ d           ¦    ¦  1 ¦ a           ¦ ¦  2 ¦ B  ¦ a           ¦    ¦  4 ¦ d           ¦ ¦  3 ¦ C  ¦ a           ¦    +------------------+ ¦  4 ¦ A  ¦ a           ¦ +-----------------------+ We execute the same MERGE statement: MERGE #Target AS t USING #Changes AS c ON c.pk = t.pk WHEN MATCHED AND c.status_code <> t.status_code THEN UPDATE SET status_code = c.status_code; However this time we receive the following message: Msg 2601, Level 14, State 1, Line 1 Cannot insert duplicate key row in object 'dbo.#Target' with unique index 'uq1'. The duplicate key value is (A). The statement has been terminated. Applying the changes using UPDATE Let’s now rewrite the MERGE to use UPDATE instead: UPDATE t SET status_code = c.status_code FROM #Target AS t JOIN #Changes AS c ON t.pk = c.pk WHERE c.status_code <> t.status_code; This query succeeds where the MERGE failed.  The two rows are updated as expected: +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦ ¦  1 ¦ A  ¦ a           ¦ <—changed back to ‘a’ ¦  2 ¦ B  ¦ a           ¦ ¦  3 ¦ C  ¦ a           ¦ ¦  4 ¦ A  ¦ d           ¦ <—changed back to ‘d’ +-----------------------+ What went wrong with the MERGE? In this test, the MERGE query execution happens to apply the changes in the order of the ‘pk’ column. In test one, this was not a problem: row 1 is removed from the unique filtered index by changing status_code from ‘a’ to ‘d’ before row 4 is added.  At no point does the table contain two rows where ak = ‘A’ and status_code = ‘a’. In test two, however, the first change was to change row 1 from status ‘d’ to status ‘a’.  This change means there would be two rows in the filtered unique index where ak = ‘A’ (both row 1 and row 4 meet the index filtering criteria ‘status_code = a’). The storage engine does not allow the query processor to violate a unique key (unless IGNORE_DUP_KEY is ON, but that is a different story, and doesn’t apply to MERGE in any case).  This strict rule applies regardless of the fact that if all changes were applied, there would be no unique key violation (row 4 would eventually be changed from ‘a’ to ‘d’, removing it from the filtered unique index, and resolving the key violation). Why it went wrong The query optimizer usually detects when this sort of temporary uniqueness violation could occur, and builds a plan that avoids the issue.  I wrote about this a couple of years ago in my post Beware Sneaky Reads with Unique Indexes (you can read more about the details on pages 495-497 of Microsoft SQL Server 2008 Internals or in Craig Freedman’s blog post on maintaining unique indexes).  To summarize though, the optimizer introduces Split, Filter, Sort, and Collapse operators into the query plan to: Split each row update into delete followed by an inserts Filter out rows that would not change the index (due to the filter on the index, or a non-updating update) Sort the resulting stream by index key, with deletes before inserts Collapse delete/insert pairs on the same index key back into an update The effect of all this is that only net changes are applied to an index (as one or more insert, update, and/or delete operations).  In this case, the net effect is a single update of the filtered unique index: changing the row for ak = ‘A’ from pk = 4 to pk = 1.  In case that is less than 100% clear, let’s look at the operation in test two again:          Target                     Changes                   Result +-----------------------+    +------------------+    +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦    ¦ pk ¦ status_code ¦    ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦    ¦----+-------------¦    ¦----+----+-------------¦ ¦  1 ¦ A  ¦ d           ¦    ¦  1 ¦ d           ¦    ¦  1 ¦ A  ¦ a           ¦ ¦  2 ¦ B  ¦ a           ¦    ¦  4 ¦ a           ¦    ¦  2 ¦ B  ¦ a           ¦ ¦  3 ¦ C  ¦ a           ¦    +------------------+    ¦  3 ¦ C  ¦ a           ¦ ¦  4 ¦ A  ¦ a           ¦                            ¦  4 ¦ A  ¦ d           ¦ +-----------------------+                            +-----------------------+ From the filtered index’s point of view (filtered for status_code = ‘a’ and shown in nonclustered index key order) the overall effect of the query is:   Before           After +---------+    +---------+ ¦ pk ¦ ak ¦    ¦ pk ¦ ak ¦ ¦----+----¦    ¦----+----¦ ¦  4 ¦ A  ¦    ¦  1 ¦ A  ¦ ¦  2 ¦ B  ¦    ¦  2 ¦ B  ¦ ¦  3 ¦ C  ¦    ¦  3 ¦ C  ¦ +---------+    +---------+ The single net change there is a change of pk from 4 to 1 for the nonclustered index entry ak = ‘A’.  This is the magic performed by the split, sort, and collapse.  Notice in particular how the original changes to the index key (on the ‘ak’ column) have been transformed into an update of a non-key column (pk is included in the nonclustered index).  By not updating any nonclustered index keys, we are guaranteed to avoid transient key violations. The Execution Plans The estimated MERGE execution plan that produces the incorrect key-violation error looks like this (click to enlarge in a new window): The successful UPDATE execution plan is (click to enlarge in a new window): The MERGE execution plan is a narrow (per-row) update.  The single Clustered Index Merge operator maintains both the clustered index and the filtered nonclustered index.  The UPDATE plan is a wide (per-index) update.  The clustered index is maintained first, then the Split, Filter, Sort, Collapse sequence is applied before the nonclustered index is separately maintained. There is always a wide update plan for any query that modifies the database. The narrow form is a performance optimization where the number of rows is expected to be relatively small, and is not available for all operations.  One of the operations that should disallow a narrow plan is maintaining a unique index where intermediate key violations could occur. Workarounds The MERGE can be made to work (producing a wide update plan with split, sort, and collapse) by: Adding all columns referenced in the filtered index’s WHERE clause to the index key (INCLUDE is not sufficient); or Executing the query with trace flag 8790 set e.g. OPTION (QUERYTRACEON 8790). Undocumented trace flag 8790 forces a wide update plan for any data-changing query (remember that a wide update plan is always possible).  Either change will produce a successfully-executing wide update plan for the MERGE that failed previously. Conclusion The optimizer fails to spot the possibility of transient unique key violations with MERGE under the conditions listed at the start of this post.  It incorrectly chooses a narrow plan for the MERGE, which cannot provide the protection of a split/sort/collapse sequence for the nonclustered index maintenance. The MERGE plan may fail at execution time depending on the order in which rows are processed, and the distribution of data in the database.  Worse, a previously solid MERGE query may suddenly start to fail unpredictably if a filtered unique index is added to the merge target table at any point. Connect bug filed here Tests performed on SQL Server 2012 SP1 CUI (build 11.0.3321) x64 Developer Edition © 2012 Paul White – All Rights Reserved Twitter: @SQL_Kiwi Email: [email protected]

    Read the article

  • MySQL Connect 8 Days Away - Replication Sessions

    - by Mat Keep
    Following on from my post about MySQL Cluster sessions at the forthcoming Connect conference, its now the turn of MySQL Replication - another technology at the heart of scaling and high availability for MySQL. Unless you've only just returned from a 6-month alien abduction, you will know that MySQL 5.6 includes the largest set of replication enhancements ever packaged into a single new release: - Global Transaction IDs + HA utilities for self-healing cluster..(yes both automatic failover and manual switchover available!) - Crash-safe slaves and binlog - Binlog Group Commit and Multi-Threaded Slaves for high performance - Replication Event Checksums and Time-Delayed replication - and many more There are a number of sessions dedicated to learn more about these important new enhancements, delivered by the same engineers who developed them. Here is a summary Saturday 29th, 13.00 Replication Tips and Tricks, Mats Kindahl In this session, the developers of MySQL Replication present a bag of useful tips and tricks related to the MySQL 5.5 GA and MySQL 5.6 development milestone releases, including multisource replication, using logs for auditing, handling filtering, examining the binary log, using relay slaves, splitting the replication stream, and handling failover. Saturday 29th, 17.30 Enabling the New Generation of Web and Cloud Services with MySQL 5.6 Replication, Lars Thalmann This session showcases the new replication features, including • High performance (group commit, multithreaded slave) • High availability (crash-safe slaves, failover utilities) • Flexibility and usability (global transaction identifiers, annotated row-based replication [RBR]) • Data integrity (event checksums) Saturday 29th, 1900 MySQL Replication Birds of a Feather In this session, the MySQL Replication engineers discuss all the goodies, including global transaction identifiers (GTIDs) with autofailover; multithreaded, crash-safe slaves; checksums; and more. The team discusses the design behind these enhancements and how to get started with them. You will get the opportunity to present your feedback on how these can be further enhanced and can share any additional replication requirements you have to further scale your critical MySQL-based workloads. Sunday 30th, 10.15 Hands-On Lab, MySQL Replication, Luis Soares and Sven Sandberg But how do you get started, how does it work, and what are the best practices and tools? During this hands-on lab, you will learn how to get started with replication, how it works, architecture, replication prerequisites, setting up a simple topology, and advanced replication configurations. The session also covers some of the new features in the MySQL 5.6 development milestone releases. Sunday 30th, 13.15 Hands-On Lab, MySQL Utilities, Chuck Bell Would you like to learn how to more effectively manage a host of MySQL servers and manage high-availability features such as replication? This hands-on lab addresses these areas and more. Participants will get familiar with all of the MySQL utilities, using each of them with a variety of options to configure and manage MySQL servers. Sunday 30th, 14.45 Eliminating Downtime with MySQL Replication, Luis Soares The presentation takes a deep dive into new replication features such as global transaction identifiers and crash-safe slaves. It also showcases a range of Python utilities that, combined with the Release 5.6 feature set, results in a self-healing data infrastructure. By the end of the session, attendees will be familiar with the new high-availability features in the whole MySQL 5.6 release and how to make use of them to protect and grow their business. Sunday 30th, 17.45 Scaling for the Web and the Cloud with MySQL Replication, Luis Soares In a Replication topology, high performance directly translates into improving read consistency from slaves and reducing the risk of data loss if a master fails. MySQL 5.6 introduces several new replication features to enhance performance. In this session, you will learn about these new features, how they work, and how you can leverage them in your applications. In addition, you will learn about some other best practices that can be used to improve performance. So how can you make sure you don't miss out - the good news is that registration is still open ;-) And just to whet your appetite, listen to the On-Demand webinar that presents an overview of MySQL 5.6 Replication.  

    Read the article

  • The current state of a MERGE Destination for SSIS

    - by jamiet
    Hugo Tap asked me on Twitter earlier today whether or not there existed a SSIS Dataflow Destination component that enabled one to MERGE data into a table rather than INSERT it. Its a common request so I thought it might be useful to summarise the current state of play as regards a MERGE destination for SSIS. Firstly, there is no MERGE destination component in the box; that is, when you install SSIS no MERGE Destination will be available. That being said the SSIS team have made available a MERGE destination component via Codeplex which you can get from http://sqlsrvintegrationsrv.codeplex.com/releases/view/19048. I have never used it so cannot vouch for its usefulness although judging by some of the reviews you might not want to set your expectations too high. Your mileage may vary.   In the past it has occurred to me that a built-in way to provide MERGE from the SSIS pipeline would be highly valuable. I assume that this would have to be provided by the database into which you were merging hence in March 2010 I submitted the following two requests to Connect: BULK MERGE (111 votes at the time of writing) [SSIS] BULK MERGE Destination (15 votes) If you think these would be useful feel free to vote them up and add a comment. Lastly, this one is nothing to do with SSIS but if you want to perform a minimally logged MERGE using T-SQL Sunil Agarwal has explained how at Minimal logging and MERGE statement. @Jamiet

    Read the article

  • PostgreSQL 9.1 Database Replication Between Two Production Environments with Load Balancer

    - by littleK
    I'm investigating different solutions for database replication between two PostgreSQL 9.1 databases. The setup will include two production servers on the cloud (Amazon EC2 X-Large Instances), with an elastic load balancer. What is the typical database implementation for for this type of setup? A master-master replication (with Bucardo or rubyrep)? Or perhaps use only one shared database between the two environments, with a shared disk failover? I've been getting some ideas from http://www.postgresql.org/docs/9.0/static/different-replication-solutions.html. Since I don't have a lot of experience in database replication, I figured I would ask the experts. What would you recommend for the described setup?

    Read the article

  • <solved> MySQL Replication A->B->C

    - by nonus25
    I was setting the MySQL Replication for master - slave/master - slave and Replication for master - slave its works fine but when i have enable this option in my.cnf log-slave-updates=1 for updating the master bin log my replications is starting be slower and the time Seconds_Behind_Master is growing. I use innodb engine but the DB is big. Any idea how i can improve the replication, looks like the network is not the issue. Also i was think to use binlog_format=ROW but master is using default setting for replication 'statement' and i cant reset master ;) Thanks ...

    Read the article

  • SQL 2008 R2 replication error: The process could not connect to Distributor

    - by Lance Lefebure
    I have two servers running SQL 2008 R2 Standard, each with an instance named "MAIN". I have a small test database on my primary server (one table, 13 rows) that I want to replicate to a second server as a proof-of-concept for some larger databases that I want to replicate. I set up the primary server to be a publisher and distributor, and set the database to do transactional replication. I copied the data to the second server via a backup/restore, not via a snapshot (which I'll have to do with the larger databases due to database size and limited bandwidth). I followed the instructions here: http://gnawgnu.blogspot.com/2009/11/sql-2008-transactional-replication-and.html Now on the subscriber, I go under Replication / Local Subscriptions / Right click / Properties on my subscription to the DB. The status of the last synchronization shows a status of: "The process could not connect to Distributor 'PRIMARYSERVER\MAIN'." Data IS replicating from the primary to the secondary. Any record I add on the primary shows up on the secondary server within seconds. Is the Distributor part of the Snapshot system that I'm not using, or is it part of the transaction replication stuff? Thanks, Lance

    Read the article

  • why use mixed-based replication for mysql

    - by Alistair Prestidge
    I am in the process of configuring MySQL replication and am intending to use row-based-replication but I was also reading up about mixed-based replication. This is where statement-based is the default and then for certain circumstances (http://dev.mysql.com/doc/refman/5.1/en/binary-log-mixed.html) MySQL will switch to row-based. The list is quit vast on when it will switch to row-based. My questions are: Does any one use mixed? If yes why did you chose this over just using one or the other? Thanks in advance

    Read the article

  • Merge Join component sorted outputs [SSIS]

    - by jamiet
    One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet  P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

    Read the article

  • Merge Join component sorted outputs [SSIS]

    - by jamiet
    One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet  P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

    Read the article

  • MySQL Database Replication and Server Load

    - by Willy
    Hi Everyone, I have an online service with around 5000 MySQL databases. Now, I am interested in building a development area of the exactly same environment in my office, therefore, I am about to setup MySQL replication between my live MySQL server and development MySQL server. But my concern is the load which will occur on my live MySQL server once replication is started. Do you have any experience? Will this process cause extra load on my production server? Thanks, have a nice weekend.

    Read the article

  • Difference between EJB Persist & Merge operation

    - by shantala.sankeshwar
    This article gives the difference between EJB Persist & Merge operations with scenarios.Use Case Description Users working on EJB persist & merge operations often have this question in mind " When merge can create new entity as well as modify existing entity,then why do we have 2 separate operations - persist & merge?" The reason is very simple.If we use merge operation to create new entity & if the entity exists then it does not throw any exception,but persist throws exception if the entity already exists.Merge should be used to modify the existing entity.The sql statement that gets executed on persist operation is insert statement.But in case of merge first select statement gets executed & then update sql statement gets executed.Scenario 1: Persist operation to create new Emp recordLet us suppose that we have a Java EE Web Application created with Entities from Emp table & have created session bean with data control. Drop Emp Object(Expand SessionEJBLocal->Constructors under Data Controls) as ADF Parameter form in jspx pageDrop persistEmp(Emp) as ADF CommandButton & provide #{bindings.EmpIterator.currentRow.dataProvider} as the value for emp parameter.Then run this page & provide values for Emp,click on 'persistEmp' button.New Emp record gets created.So when we execute persist operation only insert sql statement gets executed :INSERT INTO EMP (EMPNO, COMM, HIREDATE, ENAME, JOB, DEPTNO, SAL, MGR) VALUES (?, ?, ?, ?, ?, ?, ?, ?)    bind => [2, null, null, e2, null, 10, null, null]Scenario 2: Merge operation to modify existing Emp recordLet us suppose that we have a Java EE Web Application created with Entities from Emp table & have created session bean with data control.Drop empFindAll() Object as ADF form on jspx page.Drop mergeEmp(Emp) operation as commandButton & provide #{bindings.EmpIterator.currentRow.dataProvider} as the value for emp parameter.Then run this page & modify values for Emp record,click on 'mergeEmp' button.The respective Emp record gets modified.So when we execute merge operation select & update sql statements gets executed :SELECT EMPNO, COMM, HIREDATE, ENAME, JOB, DEPTNO, SAL, MGR FROM EMP WHERE (EMPNO = ?) bind => [7566]UPDATE EMP SET ENAME = ? WHERE (EMPNO = ?) bind => [KINGS, 7839]

    Read the article

  • T-SQL (SCD) Slowly Changing Dimension Type 2 using a merge statement

    - by AtulThakor
    Working on stored procedure recently which loads records into a data warehouse I found that the existing record was being expired using an update statement followed by an insert to add the new active record. Playing around with the merge statement you can actually expire the current record and insert a new record within one clean statement. This is how the statement works, we do the normal merge statement to insert a record when there is no match, if we match the record we update the existing record by expiring it and deactivating. At the end of the merge statement we use the output statement to output the staging values for the update,  we wrap the whole merge statement within an insert statement and add new rows for the records which we inserted. I’ve added the full script at the bottom so you can paste it and play around.   1: INSERT INTO ExampleFactUpdate 2: (PolicyID, 3: Status) 4: SELECT -- these columns are returned from the output statement 5: PolicyID, 6: Status 7: FROM 8: ( 9: -- merge statement on unique id in this case Policy_ID 10: MERGE dbo.ExampleFactUpdate dp 11: USING dbo.ExampleStag s 12: ON dp.PolicyID = s.PolicyID 13: WHEN NOT MATCHED THEN -- when we cant match the record we insert a new record record and this is all that happens 14: INSERT (PolicyID,Status) 15: VALUES (s.PolicyID, s.Status) 16: WHEN MATCHED --if it already exists 17: AND ExpiryDate IS NULL -- and the Expiry Date is null 18: THEN 19: UPDATE 20: SET 21: dp.ExpiryDate = getdate(), --we set the expiry on the existing record 22: dp.Active = 0 -- and deactivate the existing record 23: OUTPUT $Action MergeAction, s.PolicyID, s.Status -- the output statement returns a merge action which can 24: ) MergeOutput -- be insert/update/delete, on our example where a record has been updated (or expired in our case 25: WHERE -- we'll filter using a where clause 26: MergeAction = 'Update'; -- here   Complete source for example 1: if OBJECT_ID('ExampleFactUpdate') > 0 2: drop table ExampleFactUpdate 3:  4: Create Table ExampleFactUpdate( 5: ID int identity(1,1), 3: go 6: PolicyID varchar(100), 7: Status varchar(100), 8: EffectiveDate datetime default getdate(), 9: ExpiryDate datetime, 10: Active bit default 1 11: ) 12:  13:  14: insert into ExampleFactUpdate( 15: PolicyID, 16: Status) 17: select 18: 1, 19: 'Live' 20:  21: /*Create Staging Table*/ 22: if OBJECT_ID('ExampleStag') > 0 23: drop table ExampleStag 24: go 25:  26: /*Create example fact table */ 27: Create Table ExampleStag( 28: PolicyID varchar(100), 29: Status varchar(100)) 30:  31: --add some data 32: insert into ExampleStag( 33: PolicyID, 34: Status) 35: select 36: 1, 37: 'Lapsed' 38: union all 39: select 40: 2, 41: 'Quote' 42:  43: select * 44: from ExampleFactUpdate 45:  46: select * 47: from ExampleStag 48:  49:  50: INSERT INTO ExampleFactUpdate 51: (PolicyID, 52: Status) 53: SELECT -- these columns are returned from the output statement 54: PolicyID, 55: Status 56: FROM 57: ( 58: -- merge statement on unique id in this case Policy_ID 59: MERGE dbo.ExampleFactUpdate dp 60: USING dbo.ExampleStag s 61: ON dp.PolicyID = s.PolicyID 62: WHEN NOT MATCHED THEN -- when we cant match the record we insert a new record record and this is all that happens 63: INSERT (PolicyID,Status) 64: VALUES (s.PolicyID, s.Status) 65: WHEN MATCHED --if it already exists 66: AND ExpiryDate IS NULL -- and the Expiry Date is null 67: THEN 68: UPDATE 69: SET 70: dp.ExpiryDate = getdate(), --we set the expiry on the existing record 71: dp.Active = 0 -- and deactivate the existing record 72: OUTPUT $Action MergeAction, s.PolicyID, s.Status -- the output statement returns a merge action which can 73: ) MergeOutput -- be insert/update/delete, on our example where a record has been updated (or expired in our case 74: WHERE -- we'll filter using a where clause 75: MergeAction = 'Update'; -- here 76:  77:  78: select * 79: from ExampleFactUpdate 80: 

    Read the article

  • Monitor SQL Server Replication Jobs

    - by Yaniv Etrogi
    The Replication infrastructure in SQL Server is implemented using SQL Server Agent to execute the various components involved in the form of a job (e.g. LogReader agent job, Distribution agent job, Merge agent job) SQL Server jobs execute a binary executable file which is basically C++ code. You can download all the scripts for this article here SQL Server Job Schedules By default each of job has only one schedule that is set to Start automatically when SQL Server Agent starts. This schedule ensures that when ever the SQL Server Agent service is started all the replication components are also put into action. This is OK and makes sense but there is one problem with this default configuration that needs improvement  -  if for any reason one of the components fails it remains down in a stopped state.   Unless you monitor the status of each component you will typically get to know about such a failure from a customer complaint as a result of missing data or data that is not up to date at the subscriber level. Furthermore, having any of these components in a stopped state can lead to more severe problems if not corrected within a short time. The action required to improve on this default settings is in fact very simple. Adding a second schedule that is set as a Daily Reoccurring schedule which runs every 1 minute does the trick. SQL Server Agent’s scheduler module knows how to handle overlapping schedules so if the job is already being executed by another schedule it will not get executed again at the same time. So, in the event of a failure the failed job remains down for at most 60 seconds. Many DBAs are not aware of this capability and so search for more complex solutions such as having an additional dedicated job running an external code in VBS or another scripting language that detects replication jobs in a stopped state and starts them but there is no need to seek such external solutions when what is needed can be accomplished by T-SQL code. SQL Server Jobs Status In addition to the 1 minute schedule we also want to ensure that key components in the replication are enabled so I can search for those components by their Category, and set their status to enabled in case they are disabled, by executing the stored procedure MonitorEnableReplicationAgents. The jobs that I typically have handled are listed below but you may want to extend this, so below is the query to return all jobs along with their category. SELECT category_id, name FROM msdb.dbo.syscategories ORDER BY category_id; Distribution Cleanup LogReader Agent Distribution Agent Snapshot Agent Jobs By default when a publication is created, a snapshot agent job also gets created with a daily schedule. I see more organizations where the snapshot agent job does not need to be executed automatically by the SQL Server Agent  scheduler than organizations who   need a new snapshot generated automatically. To assure this setting is in place I created the stored procedure MonitorSnapshotAgentsSchedules which disables snapshot agent jobs and also deletes the job schedule. It is worth mentioning that when the publication property immediate_sync is turned off then the snapshot files are not created when the Snapshot agent is executed by the job. You control this property when the publication is created with a parameter called @immediate_sync passed to sp_addpublication and for an existing publication you can use sp_changepublication. Implementation The scripts assume the existence of a database named PerfDB. Steps: Run the scripts to create the stored procedures in the PerfDB database. Create a job that executes the stored procedures every hour. -- Verify that the 1_Minute schedule exists. EXEC PerfDB.dbo.MonitorReplicationAgentsSchedules @CategoryId = 10; /* Distribution */ EXEC PerfDB.dbo.MonitorReplicationAgentsSchedules @CategoryId = 13; /* LogReader */ -- Verify all replication agents are enabled. EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 10; /* Distribution */ EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 13; /* LogReader */ EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 11; /* Distribution clean up */ -- Verify that Snapshot agents are disabled and have no schedule EXEC PerfDB.dbo.MonitorSnapshotAgentsSchedules; Want to read more of about replication? Check at my replication posts at my blog.

    Read the article

  • TFS 2010 SDK: Smart Merge - Programmatically Create your own Merge Tool

    - by Tarun Arora
    Technorati Tags: Team Foundation Server 2010,TFS SDK,TFS API,TFS Merge Programmatically,TFS Work Items Programmatically,TFS Administration Console,ALM   The information available in the Merge window in Team Foundation Server 2010 is very important in the decision making during the merging process. However, at present the merge window shows very limited information, more that often you are interested to know the work item, files modified, code reviewer notes, policies overridden, etc associated with the change set. Our friends at Microsoft are working hard to change the game again with vNext, but because at present the merge window is a model window you have to cancel the merge process and go back one after the other to check the additional information you need. If you can relate to what i am saying, you will enjoy this blog post! I will show you how to programmatically create your own merging window using the TFS 2010 API. A few screen shots of the WPF TFS 2010 API – Custom Merging Application that we will be creating programmatically, Excited??? Let’s start coding… 1. Get All Team Project Collections for the TFS Server You can read more on connecting to TFS programmatically on my blog post => How to connect to TFS Programmatically 1: public static ReadOnlyCollection<CatalogNode> GetAllTeamProjectCollections() 2: { 3: TfsConfigurationServer configurationServer = 4: TfsConfigurationServerFactory. 5: GetConfigurationServer(new Uri("http://xxx:8080/tfs/")); 6: 7: CatalogNode catalogNode = configurationServer.CatalogNode; 8: return catalogNode.QueryChildren(new Guid[] 9: { CatalogResourceTypes.ProjectCollection }, 10: false, CatalogQueryOptions.None); 11: } 2. Get All Team Projects for the selected Team Project Collection You can read more on connecting to TFS programmatically on my blog post => How to connect to TFS Programmatically 1: public static ReadOnlyCollection<CatalogNode> GetTeamProjects(string instanceId) 2: { 3: ReadOnlyCollection<CatalogNode> teamProjects = null; 4: 5: TfsConfigurationServer configurationServer = 6: TfsConfigurationServerFactory.GetConfigurationServer(new Uri("http://xxx:8080/tfs/")); 7: 8: CatalogNode catalogNode = configurationServer.CatalogNode; 9: var teamProjectCollections = catalogNode.QueryChildren(new Guid[] {CatalogResourceTypes.ProjectCollection }, 10: false, CatalogQueryOptions.None); 11: 12: foreach (var teamProjectCollection in teamProjectCollections) 13: { 14: if (string.Compare(teamProjectCollection.Resource.Properties["InstanceId"], instanceId, true) == 0) 15: { 16: teamProjects = teamProjectCollection.QueryChildren(new Guid[] { CatalogResourceTypes.TeamProject }, false, 17: CatalogQueryOptions.None); 18: } 19: } 20: 21: return teamProjects; 22: } 3. Get All Branches with in a Team Project programmatically I will be passing the name of the Team Project for which i want to retrieve all the branches. When consuming the ‘Version Control Service’ you have the method QueryRootBranchObjects, you need to pass the recursion type => none, one, full. Full implies you are interested in all branches under that root branch. 1: public static List<BranchObject> GetParentBranch(string projectName) 2: { 3: var branches = new List<BranchObject>(); 4: 5: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<teamProjectName>")); 6: var versionControl = tfs.GetService<VersionControlServer>(); 7: 8: var allBranches = versionControl.QueryRootBranchObjects(RecursionType.Full); 9: 10: foreach (var branchObject in allBranches) 11: { 12: if (branchObject.Properties.RootItem.Item.ToUpper().Contains(projectName.ToUpper())) 13: { 14: branches.Add(branchObject); 15: } 16: } 17: 18: return branches; 19: } 4. Get All Branches associated to the Parent Branch Programmatically Now that we have the parent branch, it is important to retrieve all child branches of that parent branch. Lets see how we can achieve this using the TFS API. 1: public static List<ItemIdentifier> GetChildBranch(string parentBranch) 2: { 3: var branches = new List<ItemIdentifier>(); 4: 5: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 6: var versionControl = tfs.GetService<VersionControlServer>(); 7: 8: var i = new ItemIdentifier(parentBranch); 9: var allBranches = 10: versionControl.QueryBranchObjects(i, RecursionType.None); 11: 12: foreach (var branchObject in allBranches) 13: { 14: foreach (var childBranche in branchObject.ChildBranches) 15: { 16: branches.Add(childBranche); 17: } 18: } 19: 20: return branches; 21: } 5. Get Merge candidates between two branches Programmatically Now that we have the parent and the child branch that we are interested to perform a merge between we will use the method ‘GetMergeCandidates’ in the namespace ‘Microsoft.TeamFoundation.VersionControl.Client’ => http://msdn.microsoft.com/en-us/library/bb138934(v=VS.100).aspx 1: public static MergeCandidate[] GetMergeCandidates(string fromBranch, string toBranch) 2: { 3: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 4: var versionControl = tfs.GetService<VersionControlServer>(); 5: 6: return versionControl.GetMergeCandidates(fromBranch, toBranch, RecursionType.Full); 7: } 6. Get changeset details Programatically Now that we have the changeset id that we are interested in, we can get details of the changeset. The Changeset object contains the properties => http://msdn.microsoft.com/en-us/library/microsoft.teamfoundation.versioncontrol.client.changeset.aspx - Changes: Gets or sets an array of Change objects that comprise this changeset. - CheckinNote: Gets or sets the check-in note of the changeset. - Comment: Gets or sets the comment of the changeset. - PolicyOverride: Gets or sets the policy override information of this changeset. - WorkItems: Gets an array of work items that are associated with this changeset. 1: public static Changeset GetChangeSetDetails(int changeSetId) 2: { 3: var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri("http://<ServerName>:8080/tfs/<CollectionName>")); 4: var versionControl = tfs.GetService<VersionControlServer>(); 5: 6: return versionControl.GetChangeset(changeSetId); 7: } 7. Possibilities In future posts i will try and extend this idea to explore further possibilities, but few features that i am sure will further help during the merge decision making process would be, - View changed files - Compare modified file with current/previous version - Merge Preview - Last Merge date Any other features that you can think of?

    Read the article

  • Alter Stored Procedure in SQL Replication

    - by Refracted Paladin
    How do I, properly, ALTER a StoredProcedure in a SQL 2005 Merge Replication? I just need to add a Column. I already successfully added it to the Table and I now need to add it to a SP. I did so but now it will not synchronize with the following error -- Insert Error: Column name or number of supplied values does not match table definition. (Source: MSSQLServer, Error number: 213)

    Read the article

  • SQL Server 2008 - Performance impact of transactional replication?

    - by cxfx
    I'm planning to set up transactional replication for a 100Gb SQL Server 2008 database. I have the distributor and publisher on the same server, and am using push subscription. Should there be a performance impact on my publisher server when it creates the initial snapshot, and synchronises it with a subscriber? From what I've tried so far on a staging server, it seems to slow right down. Is there a better way to create the initial snapshot without impacting my production publisher server?

    Read the article

  • Table modifications while running db replication (MS SQL 2008)

    - by typemismatch
    I'm running SQL Server 2008 Std with a database that is being published in a "Transactional Publication" to a single subscriber. We are unable to make any changes to the tables on the publisher without getting the "cannot modify table because it is published for replication". This seems odd because schema changes (or scripts run to do this) should be pushed to the subscriber. We currently have to drop the entire publication system to make table changes. What am I missing? There must be a way to update the publisher tables? thanks!

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >