Search Results

Search found 1054 results on 43 pages for 'replication'.

Page 1/43 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Coherence - How to develop a custom push replication publisher

    - by cosmin.tudor(at)oracle.com
    CoherencePushReplicationDB.zipIn the example bellow I'm describing a way of developing a custom push replication publisher that publishes data to a database via JDBC. This example can be easily changed to publish data to other receivers (JMS,...) by performing changes to step 2 and small changes to step 3, steps that are presented bellow. I've used Eclipse as the development tool. To develop a custom push replication publishers we will need to go through 6 steps: Step 1: Create a custom publisher scheme class Step 2: Create a custom publisher class that should define what the publisher is doing. Step 3: Create a class data is performing the actions (publish to JMS, DB, etc ) for the custom publisher. Step 4: Register the new publisher against a ContentHandler. Step 5: Add the new custom publisher in the cache configuration file. Step 6: Add the custom publisher scheme class to the POF configuration file. All these steps are detailed bellow. The coherence project is attached and conclusions are presented at the end. Step 1: In the Coherence Eclipse project create a class called CustomPublisherScheme that should implement com.oracle.coherence.patterns.pushreplication.publishers.AbstractPublisherScheme. In this class define the elements of the custom-publisher-scheme element. For instance for a CustomPublisherScheme that looks like that: <sync:publisher> <sync:publisher-name>Active2-JDBC-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:custom-publisher-scheme> <sync:jdbc-string>jdbc:oracle:thin:@machine-name:1521:XE</sync:jdbc-string> <sync:username>hr</sync:username> <sync:password>hr</sync:password> </sync:custom-publisher-scheme> </sync:publisher-scheme> </sync:publisher> the code is: package com.oracle.coherence; import java.io.DataInput; import java.io.DataOutput; import java.io.IOException; import com.oracle.coherence.patterns.pushreplication.Publisher; import com.oracle.coherence.configuration.Configurable; import com.oracle.coherence.configuration.Mandatory; import com.oracle.coherence.configuration.Property; import com.oracle.coherence.configuration.parameters.ParameterScope; import com.oracle.coherence.environment.Environment; import com.tangosol.io.pof.PofReader; import com.tangosol.io.pof.PofWriter; import com.tangosol.util.ExternalizableHelper; @Configurable public class CustomPublisherScheme extends com.oracle.coherence.patterns.pushreplication.publishers.AbstractPublisherScheme { /** * */ private static final long serialVersionUID = 1L; private String jdbcString; private String username; private String password; public String getJdbcString() { return this.jdbcString; } @Property("jdbc-string") @Mandatory public void setJdbcString(String jdbcString) { this.jdbcString = jdbcString; } public String getUsername() { return username; } @Property("username") @Mandatory public void setUsername(String username) { this.username = username; } public String getPassword() { return password; } @Property("password") @Mandatory public void setPassword(String password) { this.password = password; } public Publisher realize(Environment environment, ClassLoader classLoader, ParameterScope parameterScope) { return new CustomPublisher(getJdbcString(), getUsername(), getPassword()); } public void readExternal(DataInput in) throws IOException { super.readExternal(in); this.jdbcString = ExternalizableHelper.readSafeUTF(in); this.username = ExternalizableHelper.readSafeUTF(in); this.password = ExternalizableHelper.readSafeUTF(in); } public void writeExternal(DataOutput out) throws IOException { super.writeExternal(out); ExternalizableHelper.writeSafeUTF(out, this.jdbcString); ExternalizableHelper.writeSafeUTF(out, this.username); ExternalizableHelper.writeSafeUTF(out, this.password); } public void readExternal(PofReader reader) throws IOException { super.readExternal(reader); this.jdbcString = reader.readString(100); this.username = reader.readString(101); this.password = reader.readString(102); } public void writeExternal(PofWriter writer) throws IOException { super.writeExternal(writer); writer.writeString(100, this.jdbcString); writer.writeString(101, this.username); writer.writeString(102, this.password); } } Step 2: Define what the CustomPublisher should basically do by creating a new java class called CustomPublisher that implements com.oracle.coherence.patterns.pushreplication.Publisher package com.oracle.coherence; import com.oracle.coherence.patterns.pushreplication.EntryOperation; import com.oracle.coherence.patterns.pushreplication.Publisher; import com.oracle.coherence.patterns.pushreplication.exceptions.PublisherNotReadyException; import java.io.BufferedWriter; import java.util.Iterator; public class CustomPublisher implements Publisher { private String jdbcString; private String username; private String password; private transient BufferedWriter bufferedWriter; public CustomPublisher() { } public CustomPublisher(String jdbcString, String username, String password) { this.jdbcString = jdbcString; this.username = username; this.password = password; this.bufferedWriter = null; } public String getJdbcString() { return this.jdbcString; } public String getUsername() { return username; } public String getPassword() { return password; } public void publishBatch(String cacheName, String publisherName, Iterator<EntryOperation> entryOperations) { DatabasePersistence databasePersistence = new DatabasePersistence( jdbcString, username, password); while (entryOperations.hasNext()) { EntryOperation entryOperation = (EntryOperation) entryOperations .next(); databasePersistence.databasePersist(entryOperation); } } public void start(String cacheName, String publisherName) throws PublisherNotReadyException { System.err .printf("Started: Custom JDBC Publisher for Cache %s with Publisher %s\n", new Object[] { cacheName, publisherName }); } public void stop(String cacheName, String publisherName) { System.err .printf("Stopped: Custom JDBC Publisher for Cache %s with Publisher %s\n", new Object[] { cacheName, publisherName }); } } In the publishBatch method from above we inform the publisher that he is supposed to persist data to a database: DatabasePersistence databasePersistence = new DatabasePersistence( jdbcString, username, password); while (entryOperations.hasNext()) { EntryOperation entryOperation = (EntryOperation) entryOperations .next(); databasePersistence.databasePersist(entryOperation); } Step 3: The class that deals with the persistence is a very basic one that uses JDBC to perform inserts/updates against a database. package com.oracle.coherence; import com.oracle.coherence.patterns.pushreplication.EntryOperation; import java.sql.*; import java.text.SimpleDateFormat; import com.oracle.coherence.Order; public class DatabasePersistence { public static String INSERT_OPERATION = "INSERT"; public static String UPDATE_OPERATION = "UPDATE"; public Connection dbConnection; public DatabasePersistence(String jdbcString, String username, String password) { this.dbConnection = createConnection(jdbcString, username, password); } public Connection createConnection(String jdbcString, String username, String password) { Connection connection = null; System.err.println("Connecting to: " + jdbcString + " Username: " + username + " Password: " + password); try { // Load the JDBC driver String driverName = "oracle.jdbc.driver.OracleDriver"; Class.forName(driverName); // Create a connection to the database connection = DriverManager.getConnection(jdbcString, username, password); System.err.println("Connected to:" + jdbcString + " Username: " + username + " Password: " + password); } catch (ClassNotFoundException e) { e.printStackTrace(); } // driver catch (SQLException e) { e.printStackTrace(); } return connection; } public void databasePersist(EntryOperation entryOperation) { if (entryOperation.getOperation().toString() .equalsIgnoreCase(INSERT_OPERATION)) { insert(((Order) entryOperation.getPublishableEntry().getValue())); } else if (entryOperation.getOperation().toString() .equalsIgnoreCase(UPDATE_OPERATION)) { update(((Order) entryOperation.getPublishableEntry().getValue())); } } public void update(Order order) { String update = "UPDATE Orders set QUANTITY= '" + order.getQuantity() + "', AMOUNT='" + order.getAmount() + "', ORD_DATE= '" + (new SimpleDateFormat("dd-MMM-yyyy")).format(order .getOrdDate()) + "' WHERE SYMBOL='" + order.getSymbol() + "'"; System.err.println("UPDATE = " + update); try { Statement stmt = getDbConnection().createStatement(); stmt.execute(update); stmt.close(); } catch (SQLException ex) { System.err.println("SQLException: " + ex.getMessage()); } } public void insert(Order order) { String insert = "insert into Orders values('" + order.getSymbol() + "'," + order.getQuantity() + "," + order.getAmount() + ",'" + (new SimpleDateFormat("dd-MMM-yyyy")).format(order .getOrdDate()) + "')"; System.err.println("INSERT = " + insert); try { Statement stmt = getDbConnection().createStatement(); stmt.execute(insert); stmt.close(); } catch (SQLException ex) { System.err.println("SQLException: " + ex.getMessage()); } } public Connection getDbConnection() { return dbConnection; } public void setDbConnection(Connection dbConnection) { this.dbConnection = dbConnection; } } Step 4: Now we need to register our publisher against a ContentHandler. In order to achieve that we need to create in our eclipse project a new class called CustomPushReplicationNamespaceContentHandler that should extend the com.oracle.coherence.patterns.pushreplication.configuration.PushReplicationNamespaceContentHandler. In the constructor of the new class we define a new handler for our custom publisher. package com.oracle.coherence; import com.oracle.coherence.configuration.Configurator; import com.oracle.coherence.environment.extensible.ConfigurationContext; import com.oracle.coherence.environment.extensible.ConfigurationException; import com.oracle.coherence.environment.extensible.ElementContentHandler; import com.oracle.coherence.patterns.pushreplication.PublisherScheme; import com.oracle.coherence.environment.extensible.QualifiedName; import com.oracle.coherence.patterns.pushreplication.configuration.PushReplicationNamespaceContentHandler; import com.tangosol.run.xml.XmlElement; public class CustomPushReplicationNamespaceContentHandler extends PushReplicationNamespaceContentHandler { public CustomPushReplicationNamespaceContentHandler() { super(); registerContentHandler("custom-publisher-scheme", new ElementContentHandler() { public Object onElement(ConfigurationContext context, QualifiedName qualifiedName, XmlElement xmlElement) throws ConfigurationException { PublisherScheme publisherScheme = new CustomPublisherScheme(); Configurator.configure(publisherScheme, context, qualifiedName, xmlElement); return publisherScheme; } }); } } Step 5: Now we should define our CustomPublisher in the cache configuration file according to the following documentation. <cache-config xmlns:sync="class:com.oracle.coherence.CustomPushReplicationNamespaceContentHandler" xmlns:cr="class:com.oracle.coherence.environment.extensible.namespaces.InstanceNamespaceContentHandler"> <caching-schemes> <sync:provider pof-enabled="false"> <sync:coherence-provider /> </sync:provider> <caching-scheme-mapping> <cache-mapping> <cache-name>publishing-cache</cache-name> <scheme-name>distributed-scheme-with-publishing-cachestore</scheme-name> <autostart>true</autostart> <sync:publisher> <sync:publisher-name>Active2 Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:remote-cluster-publisher-scheme> <sync:remote-invocation-service-name>remote-site1</sync:remote-invocation-service-name> <sync:remote-publisher-scheme> <sync:local-cache-publisher-scheme> <sync:target-cache-name>publishing-cache</sync:target-cache-name> </sync:local-cache-publisher-scheme> </sync:remote-publisher-scheme> <sync:autostart>true</sync:autostart> </sync:remote-cluster-publisher-scheme> </sync:publisher-scheme> </sync:publisher> <sync:publisher> <sync:publisher-name>Active2-Output-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:stderr-publisher-scheme> <sync:autostart>true</sync:autostart> <sync:publish-original-value>true</sync:publish-original-value> </sync:stderr-publisher-scheme> </sync:publisher-scheme> </sync:publisher> <sync:publisher> <sync:publisher-name>Active2-JDBC-Publisher</sync:publisher-name> <sync:publisher-scheme> <sync:custom-publisher-scheme> <sync:jdbc-string>jdbc:oracle:thin:@machine_name:1521:XE</sync:jdbc-string> <sync:username>hr</sync:username> <sync:password>hr</sync:password> </sync:custom-publisher-scheme> </sync:publisher-scheme> </sync:publisher> </cache-mapping> </caching-scheme-mapping> <!-- The following scheme is required for each remote-site when using a RemoteInvocationPublisher --> <remote-invocation-scheme> <service-name>remote-site1</service-name> <initiator-config> <tcp-initiator> <remote-addresses> <socket-address> <address>localhost</address> <port>20001</port> </socket-address> </remote-addresses> <connect-timeout>2s</connect-timeout> </tcp-initiator> <outgoing-message-handler> <request-timeout>5s</request-timeout> </outgoing-message-handler> </initiator-config> </remote-invocation-scheme> <!-- END: com.oracle.coherence.patterns.pushreplication --> <proxy-scheme> <service-name>ExtendTcpProxyService</service-name> <acceptor-config> <tcp-acceptor> <local-address> <address>localhost</address> <port>20002</port> </local-address> </tcp-acceptor> </acceptor-config> <autostart>true</autostart> </proxy-scheme> </caching-schemes> </cache-config> As you can see in the red-marked text from above I've:       - set new Namespace Content Handler       - define the new custom publisher that should work together with other publishers like: stderr and remote publishers in our case. Step 6: Add the com.oracle.coherence.CustomPublisherScheme to your custom-pof-config file: <pof-config> <user-type-list> <!-- Built in types --> <include>coherence-pof-config.xml</include> <include>coherence-common-pof-config.xml</include> <include>coherence-messagingpattern-pof-config.xml</include> <include>coherence-pushreplicationpattern-pof-config.xml</include> <!-- Application types --> <user-type> <type-id>1901</type-id> <class-name>com.oracle.coherence.Order</class-name> <serializer> <class-name>com.oracle.coherence.OrderSerializer</class-name> </serializer> </user-type> <user-type> <type-id>1902</type-id> <class-name>com.oracle.coherence.CustomPublisherScheme</class-name> </user-type> </user-type-list> </pof-config> CONCLUSIONSThis approach allows for publishers to publish data to almost any other receiver (database, JMS, MQ, ...). The only thing that needs to be changed is the DatabasePersistence.java class that should be adapted to the chosen receiver. Only minor changes are needed for the rest of the code (to publishBatch method from CustomPublisher class).

    Read the article

  • Is there any replication standard or concept for application server data replication

    - by Naga
    Hi friends, Say, I have a server that handles file based mass data and can process thousands of read requests and hundreds of provisioning requests(Add, modify, delete) per second. This is not SQL based database. Now i planned to implement replication. There should be master- master replication, master slave replication, partial replication(entries matching a criteria) and fractional replication(part of an entry). Is there any standard for implementing such replication mechanisms? Can you please suggest some solution overview? Thanks, Naga

    Read the article

  • Replication Services in a BI environment

    - by jorg
    In this blog post I will explain the principles of SQL Server Replication Services without too much detail and I will take a look on the BI capabilities that Replication Services could offer in my opinion. SQL Server Replication Services provides tools to copy and distribute database objects from one database system to another and maintain consistency afterwards. These tools basically copy or synchronize data with little or no transformations, they do not offer capabilities to transform data or apply business rules, like ETL tools do. The only “transformations” Replication Services offers is to filter records or columns out of your data set. You can achieve this by selecting the desired columns of a table and/or by using WHERE statements like this: SELECT <published_columns> FROM [Table] WHERE [DateTime] >= getdate() - 60 There are three types of replication: Transactional Replication This type replicates data on a transactional level. The Log Reader Agent reads directly on the transaction log of the source database (Publisher) and clones the transactions to the Distribution Database (Distributor), this database acts as a queue for the destination database (Subscriber). Next, the Distribution Agent moves the cloned transactions that are stored in the Distribution Database to the Subscriber. The Distribution Agent can either run at scheduled intervals or continuously which offers near real-time replication of data! So for example when a user executes an UPDATE statement on one or multiple records in the publisher database, this transaction (not the data itself) is copied to the distribution database and is then also executed on the subscriber. When the Distribution Agent is set to run continuously this process runs all the time and transactions on the publisher are replicated in small batches (near real-time), when it runs on scheduled intervals it executes larger batches of transactions, but the idea is the same. Snapshot Replication This type of replication makes an initial copy of database objects that need to be replicated, this includes the schemas and the data itself. All types of replication must start with a snapshot of the database objects from the Publisher to initialize the Subscriber. Transactional replication need an initial snapshot of the replicated publisher tables/objects to run its cloned transactions on and maintain consistency. The Snapshot Agent copies the schemas of the tables that will be replicated to files that will be stored in the Snapshot Folder which is a normal folder on the file system. When all the schemas are ready, the data itself will be copied from the Publisher to the snapshot folder. The snapshot is generated as a set of bulk copy program (BCP) files. Next, the Distribution Agent moves the snapshot to the Subscriber, if necessary it applies schema changes first and copies the data itself afterwards. The application of schema changes to the Subscriber is a nice feature, when you change the schema of the Publisher with, for example, an ALTER TABLE statement, that change is propagated by default to the Subscriber(s). Merge Replication Merge replication is typically used in server-to-client environments, for example when subscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers, like with mobile devices that need to synchronize one in a while. Because I don’t really see BI capabilities here, I will not explain this type of replication any further. Replication Services in a BI environment Transactional Replication can be very useful in BI environments. In my opinion you never want to see users to run custom (SSRS) reports or PowerPivot solutions directly on your production database, it can slow down the system and can cause deadlocks in the database which can cause errors. Transactional Replication can offer a read-only, near real-time database for reporting purposes with minimal overhead on the source system. Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!

    Read the article

  • re-enabling a table for mysql replication

    - by jessieE
    We were able to setup mysql master-slave replication with the following version on both master/slave: mysqld Ver 5.5.28-29.1-log for Linux on x86_64 (Percona Server (GPL), Release 29.1) One day, we noticed that replication has stopped, we tried skipping over the entries that caused the replication errors. The errors persisted so we decided to skip replication for the 4 problematic tables. The slave has now caught up with the master except for the 4 tables. What is the best way to enable replication again for the 4 tables? This is what I have in mind but I don't know if it will work: 1) Modify slave config to enable replication again for the 4 tables 2) stop slave replication 3) for each of the 4 tables, use pt-table-sync --execute --verbose --print --sync-to-master h=localhost,D=mydb,t=mytable 4) restart slave database to reload replication configuration 5) start slave replication

    Read the article

  • Manage and Monitor Identity Ranges in SQL Server Transactional Replication

    - by Yaniv Etrogi
    Problem When using transactional replication to replicate data in a one way topology from a publisher to a read-only subscriber(s) there is no need to manage identity ranges. However, when using  transactional replication to replicate data in a two way replication topology - between two or more servers there is a need to manage identity ranges in order to prevent a situation where an INSERT commands fails on a PRIMARY KEY violation error  due to the replicated row being inserted having a value for the identity column which already exists at the destination database. Solution There are two ways to address this situation: Assign a range of identity values per each server. Work with parallel identity values. The first method requires some maintenance while the second method does not and so the scripts provided with this article are very useful for anyone using the first method. I will explore this in more detail later in the article. In the first solution set server1 to work in the range of 1 to 1,000,000,000 and server2 to work in the range of 1,000,000,001 to 2,000,000,000.  The ranges are set and defined using the DBCC CHECKIDENT command and when the ranges in this example are well maintained you meet the goal of preventing the INSERT commands to fall due to a PRIMARY KEY violation. The first insert at server1 will get the identity value of 1, the second insert will get the value of 2 and so on while on server2 the first insert will get the identity value of 1000000001, the second insert 1000000002 and so on thus avoiding a conflict. Be aware that when a row is inserted the identity value (seed) is generated as part of the insert command at each server and the inserted row is replicated. The replicated row includes the identity column’s value so the data remains consistent across all servers but you will be able to tell on what server the original insert took place due the range that  the identity value belongs to. In the second solution you do not manage ranges but enforce a situation in which identity values can never get overlapped by setting the first identity value (seed) and the increment property one time only during the CREATE TABLE command of each table. So a table on server1 looks like this: CREATE TABLE T1 (  c1 int NOT NULL IDENTITY(1, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); And a table on server2 looks like this: CREATE TABLE T1(  c1 int NOT NULL IDENTITY(2, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); When these two tables are inserted the results of the identity values look like this: Server1:  1, 6, 11, 16, 21, 26… Server2:  2, 7, 12, 17, 22, 27… This assures no identity values conflicts while leaving a room for 3 additional servers to participate in this same environment. You can go up to 9 servers using this method by setting an increment value of 9 instead of 5 as I used in this example. Continues…

    Read the article

  • mysql replication 1x master, 1x slave

    - by clarkk
    I have just setup one master and one slave server, but its not working.. On my website I connect to the slave server and I insert some rows, but they do not appear on the master and vice versa.. What is wrong? This is what I did: Master: -> /etc/mysql/my.cnf [mysqld] log-bin = mysql-master-bin server-id=1 # bind-address = 127.0.0.1 binlog-do-db = test_db Slave: -> /etc/mysql/my.cnf [mysqld] log-bin = mysql-slave-bin server-id=2 # bind-address = 127.0.0.1 replicate-do-db = test_db Slave: terminal 0 > mysql> STOP SLAVE; // and drop tables Master: terminal 1 > mysql> CREATE USER 'repl_slave'@'slave_ip' IDENTIFIED BY 'repl_pass'; mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl_slave'@'slave_ip'; mysql> FLUSH PRIVILEGES; mysql> FLUSH TABLES WITH READ LOCK; -- leave terminal open terminal 2 > shell> mysqldump -u root -pPASSWORD test_db --lock-all-tables > dump.sql mysql> SHOW MASTER STATUS; Slave: terminal 3 > shell> mysql -u root -pPASSWORD test_db < dump.sql terminal 0 > mysql> CHANGE MASTER TO mysql> MASTER_HOST='master_ip', mysql> MASTER_USER='repl_slave', mysql> MASTER_PASSWORD='repl_pass', mysql> MASTER_PORT=3306, mysql> MASTER_LOG_FILE='mysql-master-bin.000003', // terminal 2 > SHOW MASTER STATUS mysql> MASTER_LOG_POS=4, // terminal 2 > SHOW MASTER STATUS mysql> MASTER_CONNECT_RETRY=10; mysql> START SLAVE; mysql> SHOW SLAVE STATUS; Here is the slave status: Array ( [Slave_IO_State] => Waiting for master to send event [Master_Host] => xx.xx.xx.xx [Master_User] => repl_slave [Master_Port] => 3306 [Connect_Retry] => 10 [Master_Log_File] => mysql-master-bin.000003 [Read_Master_Log_Pos] => 106 [Relay_Log_File] => mysqld-relay-bin.000002 [Relay_Log_Pos] => 258 [Relay_Master_Log_File] => mysql-master-bin.000003 [Slave_IO_Running] => Yes [Slave_SQL_Running] => Yes [Replicate_Do_DB] => test_db [Replicate_Ignore_DB] => [Replicate_Do_Table] => [Replicate_Ignore_Table] => [Replicate_Wild_Do_Table] => [Replicate_Wild_Ignore_Table] => [Last_Errno] => 0 [Last_Error] => [Skip_Counter] => 0 [Exec_Master_Log_Pos] => 106 [Relay_Log_Space] => 414 [Until_Condition] => None [Until_Log_File] => [Until_Log_Pos] => 0 [Master_SSL_Allowed] => No [Master_SSL_CA_File] => [Master_SSL_CA_Path] => [Master_SSL_Cert] => [Master_SSL_Cipher] => [Master_SSL_Key] => [Seconds_Behind_Master] => 0 [Master_SSL_Verify_Server_Cert] => No [Last_IO_Errno] => 0 [Last_IO_Error] => [Last_SQL_Errno] => 0 [Last_SQL_Error] => )

    Read the article

  • Replication Services as ETL extraction tool

    - by jorg
    In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are  asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication   The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication…   The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next   You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button.   In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines!   On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard   The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication.   Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions   The New Subscription Wizard appears   Select the publisher on which you just created your publication and select the database and publication (SnapshotTest)   You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database   Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful.   You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here   Click Next   Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot:   Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot:   Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data:   Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

    Read the article

  • MySQL Connect 8 Days Away - Replication Sessions

    - by Mat Keep
    Following on from my post about MySQL Cluster sessions at the forthcoming Connect conference, its now the turn of MySQL Replication - another technology at the heart of scaling and high availability for MySQL. Unless you've only just returned from a 6-month alien abduction, you will know that MySQL 5.6 includes the largest set of replication enhancements ever packaged into a single new release: - Global Transaction IDs + HA utilities for self-healing cluster..(yes both automatic failover and manual switchover available!) - Crash-safe slaves and binlog - Binlog Group Commit and Multi-Threaded Slaves for high performance - Replication Event Checksums and Time-Delayed replication - and many more There are a number of sessions dedicated to learn more about these important new enhancements, delivered by the same engineers who developed them. Here is a summary Saturday 29th, 13.00 Replication Tips and Tricks, Mats Kindahl In this session, the developers of MySQL Replication present a bag of useful tips and tricks related to the MySQL 5.5 GA and MySQL 5.6 development milestone releases, including multisource replication, using logs for auditing, handling filtering, examining the binary log, using relay slaves, splitting the replication stream, and handling failover. Saturday 29th, 17.30 Enabling the New Generation of Web and Cloud Services with MySQL 5.6 Replication, Lars Thalmann This session showcases the new replication features, including • High performance (group commit, multithreaded slave) • High availability (crash-safe slaves, failover utilities) • Flexibility and usability (global transaction identifiers, annotated row-based replication [RBR]) • Data integrity (event checksums) Saturday 29th, 1900 MySQL Replication Birds of a Feather In this session, the MySQL Replication engineers discuss all the goodies, including global transaction identifiers (GTIDs) with autofailover; multithreaded, crash-safe slaves; checksums; and more. The team discusses the design behind these enhancements and how to get started with them. You will get the opportunity to present your feedback on how these can be further enhanced and can share any additional replication requirements you have to further scale your critical MySQL-based workloads. Sunday 30th, 10.15 Hands-On Lab, MySQL Replication, Luis Soares and Sven Sandberg But how do you get started, how does it work, and what are the best practices and tools? During this hands-on lab, you will learn how to get started with replication, how it works, architecture, replication prerequisites, setting up a simple topology, and advanced replication configurations. The session also covers some of the new features in the MySQL 5.6 development milestone releases. Sunday 30th, 13.15 Hands-On Lab, MySQL Utilities, Chuck Bell Would you like to learn how to more effectively manage a host of MySQL servers and manage high-availability features such as replication? This hands-on lab addresses these areas and more. Participants will get familiar with all of the MySQL utilities, using each of them with a variety of options to configure and manage MySQL servers. Sunday 30th, 14.45 Eliminating Downtime with MySQL Replication, Luis Soares The presentation takes a deep dive into new replication features such as global transaction identifiers and crash-safe slaves. It also showcases a range of Python utilities that, combined with the Release 5.6 feature set, results in a self-healing data infrastructure. By the end of the session, attendees will be familiar with the new high-availability features in the whole MySQL 5.6 release and how to make use of them to protect and grow their business. Sunday 30th, 17.45 Scaling for the Web and the Cloud with MySQL Replication, Luis Soares In a Replication topology, high performance directly translates into improving read consistency from slaves and reducing the risk of data loss if a master fails. MySQL 5.6 introduces several new replication features to enhance performance. In this session, you will learn about these new features, how they work, and how you can leverage them in your applications. In addition, you will learn about some other best practices that can be used to improve performance. So how can you make sure you don't miss out - the good news is that registration is still open ;-) And just to whet your appetite, listen to the On-Demand webinar that presents an overview of MySQL 5.6 Replication.  

    Read the article

  • PostgreSQL 9.1 Database Replication Between Two Production Environments with Load Balancer

    - by littleK
    I'm investigating different solutions for database replication between two PostgreSQL 9.1 databases. The setup will include two production servers on the cloud (Amazon EC2 X-Large Instances), with an elastic load balancer. What is the typical database implementation for for this type of setup? A master-master replication (with Bucardo or rubyrep)? Or perhaps use only one shared database between the two environments, with a shared disk failover? I've been getting some ideas from http://www.postgresql.org/docs/9.0/static/different-replication-solutions.html. Since I don't have a lot of experience in database replication, I figured I would ask the experts. What would you recommend for the described setup?

    Read the article

  • <solved> MySQL Replication A->B->C

    - by nonus25
    I was setting the MySQL Replication for master - slave/master - slave and Replication for master - slave its works fine but when i have enable this option in my.cnf log-slave-updates=1 for updating the master bin log my replications is starting be slower and the time Seconds_Behind_Master is growing. I use innodb engine but the DB is big. Any idea how i can improve the replication, looks like the network is not the issue. Also i was think to use binlog_format=ROW but master is using default setting for replication 'statement' and i cant reset master ;) Thanks ...

    Read the article

  • SQL 2008 R2 replication error: The process could not connect to Distributor

    - by Lance Lefebure
    I have two servers running SQL 2008 R2 Standard, each with an instance named "MAIN". I have a small test database on my primary server (one table, 13 rows) that I want to replicate to a second server as a proof-of-concept for some larger databases that I want to replicate. I set up the primary server to be a publisher and distributor, and set the database to do transactional replication. I copied the data to the second server via a backup/restore, not via a snapshot (which I'll have to do with the larger databases due to database size and limited bandwidth). I followed the instructions here: http://gnawgnu.blogspot.com/2009/11/sql-2008-transactional-replication-and.html Now on the subscriber, I go under Replication / Local Subscriptions / Right click / Properties on my subscription to the DB. The status of the last synchronization shows a status of: "The process could not connect to Distributor 'PRIMARYSERVER\MAIN'." Data IS replicating from the primary to the secondary. Any record I add on the primary shows up on the secondary server within seconds. Is the Distributor part of the Snapshot system that I'm not using, or is it part of the transaction replication stuff? Thanks, Lance

    Read the article

  • why use mixed-based replication for mysql

    - by Alistair Prestidge
    I am in the process of configuring MySQL replication and am intending to use row-based-replication but I was also reading up about mixed-based replication. This is where statement-based is the default and then for certain circumstances (http://dev.mysql.com/doc/refman/5.1/en/binary-log-mixed.html) MySQL will switch to row-based. The list is quit vast on when it will switch to row-based. My questions are: Does any one use mixed? If yes why did you chose this over just using one or the other? Thanks in advance

    Read the article

  • MySQL Database Replication and Server Load

    - by Willy
    Hi Everyone, I have an online service with around 5000 MySQL databases. Now, I am interested in building a development area of the exactly same environment in my office, therefore, I am about to setup MySQL replication between my live MySQL server and development MySQL server. But my concern is the load which will occur on my live MySQL server once replication is started. Do you have any experience? Will this process cause extra load on my production server? Thanks, have a nice weekend.

    Read the article

  • Monitor SQL Server Replication Jobs

    - by Yaniv Etrogi
    The Replication infrastructure in SQL Server is implemented using SQL Server Agent to execute the various components involved in the form of a job (e.g. LogReader agent job, Distribution agent job, Merge agent job) SQL Server jobs execute a binary executable file which is basically C++ code. You can download all the scripts for this article here SQL Server Job Schedules By default each of job has only one schedule that is set to Start automatically when SQL Server Agent starts. This schedule ensures that when ever the SQL Server Agent service is started all the replication components are also put into action. This is OK and makes sense but there is one problem with this default configuration that needs improvement  -  if for any reason one of the components fails it remains down in a stopped state.   Unless you monitor the status of each component you will typically get to know about such a failure from a customer complaint as a result of missing data or data that is not up to date at the subscriber level. Furthermore, having any of these components in a stopped state can lead to more severe problems if not corrected within a short time. The action required to improve on this default settings is in fact very simple. Adding a second schedule that is set as a Daily Reoccurring schedule which runs every 1 minute does the trick. SQL Server Agent’s scheduler module knows how to handle overlapping schedules so if the job is already being executed by another schedule it will not get executed again at the same time. So, in the event of a failure the failed job remains down for at most 60 seconds. Many DBAs are not aware of this capability and so search for more complex solutions such as having an additional dedicated job running an external code in VBS or another scripting language that detects replication jobs in a stopped state and starts them but there is no need to seek such external solutions when what is needed can be accomplished by T-SQL code. SQL Server Jobs Status In addition to the 1 minute schedule we also want to ensure that key components in the replication are enabled so I can search for those components by their Category, and set their status to enabled in case they are disabled, by executing the stored procedure MonitorEnableReplicationAgents. The jobs that I typically have handled are listed below but you may want to extend this, so below is the query to return all jobs along with their category. SELECT category_id, name FROM msdb.dbo.syscategories ORDER BY category_id; Distribution Cleanup LogReader Agent Distribution Agent Snapshot Agent Jobs By default when a publication is created, a snapshot agent job also gets created with a daily schedule. I see more organizations where the snapshot agent job does not need to be executed automatically by the SQL Server Agent  scheduler than organizations who   need a new snapshot generated automatically. To assure this setting is in place I created the stored procedure MonitorSnapshotAgentsSchedules which disables snapshot agent jobs and also deletes the job schedule. It is worth mentioning that when the publication property immediate_sync is turned off then the snapshot files are not created when the Snapshot agent is executed by the job. You control this property when the publication is created with a parameter called @immediate_sync passed to sp_addpublication and for an existing publication you can use sp_changepublication. Implementation The scripts assume the existence of a database named PerfDB. Steps: Run the scripts to create the stored procedures in the PerfDB database. Create a job that executes the stored procedures every hour. -- Verify that the 1_Minute schedule exists. EXEC PerfDB.dbo.MonitorReplicationAgentsSchedules @CategoryId = 10; /* Distribution */ EXEC PerfDB.dbo.MonitorReplicationAgentsSchedules @CategoryId = 13; /* LogReader */ -- Verify all replication agents are enabled. EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 10; /* Distribution */ EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 13; /* LogReader */ EXEC PerfDB.dbo.MonitorEnableReplicationAgents @CategoryId = 11; /* Distribution clean up */ -- Verify that Snapshot agents are disabled and have no schedule EXEC PerfDB.dbo.MonitorSnapshotAgentsSchedules; Want to read more of about replication? Check at my replication posts at my blog.

    Read the article

  • Alter Stored Procedure in SQL Replication

    - by Refracted Paladin
    How do I, properly, ALTER a StoredProcedure in a SQL 2005 Merge Replication? I just need to add a Column. I already successfully added it to the Table and I now need to add it to a SP. I did so but now it will not synchronize with the following error -- Insert Error: Column name or number of supplied values does not match table definition. (Source: MSSQLServer, Error number: 213)

    Read the article

  • SQL Server 2008 - Performance impact of transactional replication?

    - by cxfx
    I'm planning to set up transactional replication for a 100Gb SQL Server 2008 database. I have the distributor and publisher on the same server, and am using push subscription. Should there be a performance impact on my publisher server when it creates the initial snapshot, and synchronises it with a subscriber? From what I've tried so far on a staging server, it seems to slow right down. Is there a better way to create the initial snapshot without impacting my production publisher server?

    Read the article

  • Table modifications while running db replication (MS SQL 2008)

    - by typemismatch
    I'm running SQL Server 2008 Std with a database that is being published in a "Transactional Publication" to a single subscriber. We are unable to make any changes to the tables on the publisher without getting the "cannot modify table because it is published for replication". This seems odd because schema changes (or scripts run to do this) should be pushed to the subscriber. We currently have to drop the entire publication system to make table changes. What am I missing? There must be a way to update the publisher tables? thanks!

    Read the article

  • SQL Merge Replication - Filter Sets

    - by Refracted Paladin
    I have a "working" Replication Set in SQL 2005 that we use in house to our users at remote branches on SQL Express 2005. I want to apply a filter to our biggest Set to help minimize the bandwidth impact. What I am asking is what considerations do I need to take into account before throwing a filter on there. Will it cause any issues I should be aware of? Does it affect compression adversely. Will everyone need to reinitialize after applying it? Any heads up or insight would be appreciated. Thanks,

    Read the article

  • master-slave-slave replication: master will become bottleneck for writes

    - by JMW
    hi, the mysql database has arround 2TB of data. i have a master-slave-slave replication running. the application that uses the database does read (SELECT) queries just on one of the 2 slaves and write (DELETE/INSERT/UPDATE) queries on the master. the application does way more reads, than writes. if we have a problem with the read (SELECT) queries, we can just add another slave database and tell the application, that there is another salve. so it scales well... Currently, the master is running arround 40% disk io due to the writes. So i'm thinking about how to scale the the database in the future. Because one day the master will be overloaded. What could be a solution there? maybe mysql cluster? if so, are there any pitfalls or limitations in switching the database to ndb? thanks a lot in advance... :)

    Read the article

  • MySQL replication - rapidly growing relay bin logs

    - by Rob Forrest
    Morning all, I've got a really strange situation here this morning much like a reportedly fixed MySQL bug. http://bugs.mysql.com/bug.php?id=28421 My relay bin logs are rapidly filling with an infinite loop of junk made of this sort of thing. #121018 5:40:04 server id 101 end_log_pos 15598207 #Append_block: file_id: 2244 block_len: 8192 # at 15598352 #121018 5:40:04 server id 101 end_log_pos 15606422 #Append_block: file_id: 2244 block_len: 8192 # at 15606567 ... # at 7163731 #121018 5:38:39 server id 101 end_log_pos 7171801 #Append_block: file_id: 2243 block_len: 8192 WARNING: Ignoring Append_block as there is no Create_file event for file_id: 2243 # at 7171946 #121018 5:38:39 server id 101 end_log_pos 7180016 #Append_block: file_id: 2243 block_len: 8192 WARNING: Ignoring Append_block as there is no Create_file event for file_id: 2243 These log files grow to 1Gb within about a minute before rotating and starting again. These big files are interspersed with 1 or 2 smaller files with just this in /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #121023 9:43:05 server id 100 end_log_pos 106 Start: binlog v 4, server v 5.1.61-log created 121023 9:43:05 BINLOG ' mViGUA9kAAAAZgAAAGoAAAAAAAQANS4xLjYxLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC '/*!*/; # at 106 #121023 9:43:05 server id 100 end_log_pos 156 Rotate to mysqld-relay-bin.000003 pos: 4 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; We're running a master-master replication setup with the problematic server running mysql 5.1.61. The other server which is, for the moment, stable is running 5.1.58. Has anyone got any ideas what the solution is to this and moreover, what might have caused this?

    Read the article

  • Synchronizing in SQL Replication works when manually syncing, but not automatically

    - by Dominic Zukiewicz
    I'm using SQL Server 2005 to create a replication copy of the main databases, so that the reports can point to the replication copy instead of locking out our main databases. I have set up the 3 databases as publications and then 3 subscribers moving the transactions over to the subscribers, instantaneously I hope! What seems to be happening is that when using the "Insert Tracer" function, replication take publisher to distributor < 2 seconds, but to replicate to the subscribers can take over 7 minutes (and these are local databases on a SAN). This could be for 2 reasons: The SQL statements used to query the database are obtaining locks which are stopping the transactions updating the subscribers. The subscribers are just too busy for the replication to apply the changes. What seems to trouble me more, is that although the Replication Monitor / Insert Tracer are showing these statistics, if you use the "View Subscription Details" and then click Start, it will sync within seconds. My goal would be to have the data syncing (ideally) continuously, or every minute, perhaps I should reduce the batch size of the transactions? What am I doing wrong? [Note that the -Continuous flag is set!]

    Read the article

  • SQL Replication (subsciber) can't connect to publication

    - by a3code
    I have 2 Virtual Machines, One with MS SQL server 2008 R2, other with MS SQL Server 2012 Express.... On 1 I have configuration for replication (publication), and I would like to setup Express version like subscriber. but I can't to connect to publisher SQL Server replication requires the actual server name to make a connection to the server. Specify the actual server name, 'XXXX'. (Replication.Utilities) I have tried to cheat and added XXXX server name to hosts file, but it dos't help. Additianlly I used to run http://www.hagrin.com/332/fixing-sql-server-replication-requires-actual-server-name-make-connection-server-error action for setup publication in correct way What I need to do for successful connection ?

    Read the article

  • Adding FK Index to existing table in Merge Replication Topology

    - by Refracted Paladin
    I have a table that has grown quite large that we are replicating to about 120 subscribers. A FK on that table does not have an index and when I ran an Execution Plan on a query that was causing issues it had this to say -- /* Missing Index Details from CaseNotesTimeoutQuerys.sql - mylocal\sqlexpress.MATRIX (WWCARES\pschaller (54)) The Query Processor estimates that implementing the following index could improve the query cost by 99.5556%. */ /* USE [MATRIX] GO CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[tblCaseNotes] ([PersonID]) GO */ I would like to add this but I am afraid it will FORCE a reinitialization. Can anyone verify or validate my concerns? Does it even work that way or would I need to run the script on each subscriber? Any insight would be appreciated.

    Read the article

  • SQL Server 2008 R2 Replication log reader could not execute sp_replcmds

    - by user49352
    This log reader agent worked perfectly for several months until the user referenced in the error was removed from the domain. After that time the error 'The process could not execute 'sp_replcmds' on 'SERVER'' was received with more detail 'Could not obtain information about Windows NT group/user' that referenced said user. This user was referenced nowhere in the the log reader agent other than the Publication Access List from which it was subsequenctly removed. The agent would still not successfully start up. The simple problem here, I believe, is that the log reader agent was created under that user and that no longer exists in the domain. Is there an 'owner' somewhere that needs to be changed? Every other function on the database continues to execute successfully. Any other help or thought would be appreciated.

    Read the article

  • How to prevent bad queries from breaking replication?

    - by nulll
    For my personal experience, mysql replication is fragile. I know that there area many things not to do beacuse they could break replication, but we are humans and the error could always occur. So I was thinking... is there a way to enforce mysql replication? Something that prevents queries that are dangerous for the replication to be runt? In other words I'm searching for something that saves replication even if I accidentally run evil queries.

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >