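Batch insert with scalikejdbc is slow on a remote computer

I am trying to insert rows into a table in batches of 100 (I have heard this is the best batch size to use with MySQL). I use Scala 2.10.4 with sbt 0.13.6, and the JDBC libraries I am using are HikariCP and scalikejdbc. My connection settings look like this: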
val dataSource: DataSource = {
  val ds = new HikariDataSource()
  ds.setDataSourceClassName("com.mysql.jdbc.jdbc2.optional.MysqlDataSource")
  ds.addDataSourceProperty("url", "jdbc:mysql://" + org.Server.GlobalSettings.DB.mySQLIP + ":3306?rewriteBatchedStatements=true")
  ds.addDataSourceProperty("autoCommit", "false")
  ds.addDataSourceProperty("user", "someUser")
  ds.addDataSourceProperty("password", "not my password")
  ds
}
ConnectionPool.add('review, new DataSourceConnectionPool(dataSource))
Insert code:
try {
  implicit val session = AutoSession
  val paramList: scala.collection.mutable.ListBuffer[Seq[(Symbol, Any)]] = scala.collection.mutable.ListBuffer[Seq[(Symbol, Any)]]()
  .
  .
  .
  for (rev <- reviews) {
    paramList += Seq[(Symbol, Any)](
      'review_id -> rev.review_idx,
      'text -> rev.text,
      'category_id -> rev.category_id,
      'aspect_id -> aspectId,
      'not_aspect -> noAspect /*0*/ ,
      'certainty_aspect -> rev.certainty_aspect,
      'sentiment -> rev.sentiment,
      'sentiment_grade -> rev.certainty_sentiment,
      'stars -> rev.stars
    )
  }
.
.
.
  try {
    if (paramList != null && paramList.length > 0) {
      val result = NamedDB('review) localTx { implicit session =>
        sql"""INSERT INTO `MasterFlow`.`classifier_results`
        (
        `review_id`,
        `text`,
        `category_id`,
        `aspect_id`,
        `not_aspect`,
        `certainty_aspect`,
        `sentiment`,
        `sentiment_grade`,
        `stars`)
        VALUES
        ({review_id}, {text}, {category_id}, {aspect_id},
        {not_aspect}, {certainty_aspect}, {sentiment}, {sentiment_grade}, {stars})
        """
          .batchByName(paramList.toIndexedSeq: _*)/*.__resultOfEnsuring*/
          .apply()
      }
Every batch insert takes about 15 seconds. My logs:
29/10/2014 14:03:36 - DEBUG[Hikari Housekeeping Timer (pool HikariPool-0)] HikariPool - Before cleanup pool stats HikariPool-0 (total=10, inUse=1, avail=9, waiting=0)
29/10/2014 14:03:36 - DEBUG[Hikari Housekeeping Timer (pool HikariPool-0)] HikariPool - After cleanup pool stats HikariPool-0 (total=10, inUse=1, avail=9, waiting=0)
29/10/2014 14:03:46 - DEBUG[default-akka.actor.default-dispatcher-3] StatementExecutor$$anon$1 - SQL execution completed
[SQL Execution]
INSERT INTO `MasterFlow`.`classifier_results` (`review_id`, `text`, `category_id`, `aspect_id`, `not_aspect`, `certainty_aspect`, `sentiment`, `sentiment_grade`, `stars`) VALUES (...can't show this....);
INSERT INTO `MasterFlow`.`classifier_results` (`review_id`, `text`, `category_id`, `aspect_id`, `not_aspect`, `certainty_aspect`, `sentiment`, `sentiment_grade`, `stars`) VALUES (...can't show this....);
.
.
.
INSERT INTO `MasterFlow`.`classifier_results` (`review_id`, `text`, `category_id`, `aspect_id`, `not_aspect`, `certainty_aspect`, `sentiment`, `sentiment_grade`, `stars`) VALUES (...can't show this....);
... (total: 100 times); (15466 ms)
[Stack Trace]
...
logic.DB.ClassifierJsonToDB$$anonfun$1.apply(ClassifierJsonToDB.scala:119)
logic.DB.ClassifierJsonToDB$$anonfun$1.apply(ClassifierJsonToDB.scala:96)
scalikejdbc.DBConnection$$anonfun$_localTx$1$1.apply(DBConnection.scala:252)
scala.util.control.Exception$Catch.apply(Exception.scala:102)
scalikejdbc.DBConnection$class._localTx$1(DBConnection.scala:250)
scalikejdbc.DBConnection$$anonfun$localTx$1.apply(DBConnection.scala:257)
scalikejdbc.DBConnection$$anonfun$localTx$1.apply(DBConnection.scala:257)
scalikejdbc.LoanPattern$class.using(LoanPattern.scala:33)
scalikejdbc.NamedDB.using(NamedDB.scala:32)
scalikejdbc.DBConnection$class.localTx(DBConnection.scala:257)
scalikejdbc.NamedDB.localTx(NamedDB.scala:32)
logic.DB.ClassifierJsonToDB$.insertBulk(ClassifierJsonToDB.scala:96)
logic.DB.ClassifierJsonToDB$$anonfun$bulkInsert$1.apply(ClassifierJsonToDB.scala:176)
logic.DB.ClassifierJsonToDB$$anonfun$bulkInsert$1.apply(ClassifierJsonToDB.scala:167)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
...
When I run this on the server that hosts the MySQL database, it runs fast. What can I do to make it run faster from a remote computer?
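For scale, here is a rough back-of-the-envelope reading of the log above. It assumes the 15466 ms is spread evenly across the 100 statements, which the log itself does not confirm. The resulting per-statement figure would be consistent with each INSERT making its own network round trip, though that is only a guess.

```scala
// Figures taken from the "[SQL Execution]" log entry above.
val totalMillis = 15466
val statements = 100

// Average wall-clock time per INSERT, assuming the time is spread evenly
// over all statements (an assumption, not something the log states).
val perStatementMillis = totalMillis.toDouble / statements

println(f"$perStatementMillis%.2f ms per statement on average")
```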
Try turning off autocommit in HikariCP and letting scalikejdbc handle the commits. – brettw
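A minimal sketch of what that suggestion might look like, reusing the connection settings from the question. In the original setup, `addDataSourceProperty("autoCommit", "false")` hands the value to the underlying MySQL data source, whereas HikariCP's own pool-level switch is `setAutoCommit`. Whether this change alone removes the remote slowdown is untested here.

```scala
import javax.sql.DataSource
import com.zaxxer.hikari.HikariDataSource
import scalikejdbc.{ConnectionPool, DataSourceConnectionPool}

val dataSource: DataSource = {
  val ds = new HikariDataSource()
  ds.setDataSourceClassName("com.mysql.jdbc.jdbc2.optional.MysqlDataSource")
  // Driver-level properties are still forwarded to MysqlDataSource as before.
  ds.addDataSourceProperty("url", "jdbc:mysql://" + org.Server.GlobalSettings.DB.mySQLIP + ":3306?rewriteBatchedStatements=true")
  ds.addDataSourceProperty("user", "someUser")
  ds.addDataSourceProperty("password", "not my password")
  // Pool-level setting: connections handed out by HikariCP start with
  // autocommit off, so the localTx block decides when to commit.
  ds.setAutoCommit(false)
  ds
}

ConnectionPool.add('review, new DataSourceConnectionPool(dataSource))
```

The insert code from the question can stay as it is, since `NamedDB('review) localTx { ... }` already commits when the block completes normally and rolls back on an exception.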