OrientDB Version: 2.2.30
Java Version: 1.8
OS: CentOS 7
Expected behavior
Caused by: com.orientechnologies.orient.core.exception.OPageIsBrokenException: Following files and pages are detected to be broken ['ner_16.pcl' :138;], storage is switched to 'read only' mode. Any modification operations are prohibited. Typically it means hardware error, before filling a bug please check your hardware. To restore database and make it fully operational you may export and import database to and from JSON.
DB name="db_skynet"
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkLowDiskSpaceRequestsAndReadOnlyConditions(OAbstractPaginatedStorage.java:5130)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.createRecord(OAbstractPaginatedStorage.java:1342)
at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$5.call(ODistributedStorage.java:644)
at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$5.call(ODistributedStorage.java:638)
at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$11.call(ODistributedStorage.java:1220)
at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:70)
at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage.executeRecordOperationInLock(ODistributedStorage.java:1217)
at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage.createRecord(ODistributedStorage.java:637)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2216)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveNew(OTransactionNoTx.java:241)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:171)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2782)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2662)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:103)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.createRecord(ONetworkProtocolBinary.java:2844)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.createRecord(ONetworkProtocolBinary.java:1832)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:629)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.sessionRequest(ONetworkProtocolBinary.java:398)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.execute(ONetworkProtocolBinary.java:217)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:82)
2018-01-19_17:22:44 [Executor task launch worker-1] WARN orientdb.graph.dao.impl.GraphDaoImpl:215: renew and try again! tryCount: 1
2018-01-19_17:22:44 [Executor task launch worker-1] ERROR orientechnologies.orient.client.binary.OChannelBinaryAsynchClient:143: Error during exception deserialization
java.lang.NoSuchMethodException: com.orientechnologies.orient.core.exception.OPageIsBrokenException.<init>(com.orientechnologies.orient.core.exception.OPageIsBrokenException)
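For anyone who needs to find out programmatically which cluster files are affected, the file names and page indexes can be pulled out of the exception message. Below is a minimal sketch in plain Java (no OrientDB classes involved). The class name `BrokenPageParser` and its methods are hypothetical, and the assumption that several entries would appear as `'file' :page;` repeated inside the brackets is extrapolated from the single entry in the log above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OrientDB: extracts broken file/page pairs
// from the text of an OPageIsBrokenException message.
public class BrokenPageParser {

    /** One broken page reference: file name plus page index. */
    public static final class BrokenPage {
        public final String fileName;
        public final long pageIndex;

        BrokenPage(String fileName, long pageIndex) {
            this.fileName = fileName;
            this.pageIndex = pageIndex;
        }
    }

    // Assumed entry format, based on the log above: 'ner_16.pcl' :138;
    private static final Pattern ENTRY =
            Pattern.compile("'([^']+)'\\s*:\\s*(\\d+)\\s*;");

    /** Returns all entries found inside the first [...] of the message. */
    public static List<BrokenPage> parse(String message) {
        List<BrokenPage> result = new ArrayList<>();
        int start = message.indexOf('[');
        int end = message.indexOf(']', start + 1);
        if (start < 0 || end < 0) {
            return result; // no bracketed list, nothing to report
        }
        Matcher m = ENTRY.matcher(message.substring(start + 1, end));
        while (m.find()) {
            result.add(new BrokenPage(m.group(1), Long.parseLong(m.group(2))));
        }
        return result;
    }

    public static void main(String[] args) {
        String msg = "Following files and pages are detected to be broken "
                + "['ner_16.pcl' :138;], storage is switched to 'read only' mode.";
        for (BrokenPage p : parse(msg)) {
            System.out.println(p.fileName + " page " + p.pageIndex);
        }
    }
}
```

The `.pcl` file name identifies the cluster file, which helps narrow down which data sits on the broken page before deciding on an export/import.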
We ended up migrating to MongoDB 🙈
@laa I managed to follow the repair instructions with the build that you provided. However, it looks like it did not repair anything for me.
Hi, it will be released in a new 2.2.x version soon. In the meantime, here is a build which already contains this tool: https://drive.google.com/file/d/1cuZL4qZ6OKHLSgelryOWI8wiqjnsT7AA/view?usp=sharing
You need to do the following:
repair database --fix-cluster
repair database --fix-links
Please send feedback on how it works. @devsprint if you experience any issues with broken pages after that, it would be great if you could send me the server log covering the period since the execution of this repair.
@laa it is reproducible on version 2.2.32 too. I did not find the root cause, but it happens every few days. The export and import solution is not viable for a production environment. Is there any progress on this issue?
Hi @laa, thank you very much for looking into this serious issue.
My company is also running into it, and we were wondering what the cause of the problem is but couldn’t find anything in the changelog for version 3.0.13 or any linked PR for the fix. We are preparing to migrate to OrientDB 3.0, but as our production cluster is going to be on 2.2.37 in the meantime, we’d like to try to mitigate and minimize the impact of the problem as much as possible.
After all, it’s really difficult for us to focus on migrating to OrientDB 3.0 when our existing database is on fire on a regular basis. Any help or suggestions you are able to provide would be very much appreciated.
I upgraded the specs of the VM we were running it on and gave it another CPU.
I can’t say for sure that it’s fixed, but I haven’t had the issue since then.
@devsprint it is already released. @acuciureanu sorry for the delay, I will try to finish it today.
The reason for this issue is that the server crashed and was only partially restored, or an old version of the DB was used, and as a result a page was broken. We will port part of the durability fixes applied in 3.0 to 2.2.x to prevent this issue. But if the data on a page is already broken, no software update will fix that. For commercial support, I would propose that you send the database to us and we will fix it in one day. For community support, the only means right now is to export and import the database. A database in read-only mode can still be exported to JSON; read-only mode does not prevent the export. Anyway, let me think; maybe I will find a way to isolate the page in the cluster so that you do not access it, which would prevent the issue in production. I will update you today with whatever solution I come up with.
The method to restore is stated in the exception: “To restore database and make it fully operational you may export and import database to and from JSON”