orientdb: Files found broken while writing data to OrientDB 2.2.30

OrientDB Version: 2.2.30

Java Version: 1.8

OS: centos 7

Expected behavior

Caused by: com.orientechnologies.orient.core.exception.OPageIsBrokenException: Following files and pages are detected to be broken ['ner_16.pcl' :138;], storage is switched to 'read only' mode. Any modification operations are prohibited. Typically it means hardware error, before filling a bug please check your hardware. To restore database and make it fully operational you may export and import database to and from JSON.
	DB name="db_skynet"
	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.checkLowDiskSpaceRequestsAndReadOnlyConditions(OAbstractPaginatedStorage.java:5130)
	at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.createRecord(OAbstractPaginatedStorage.java:1342)
	at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$5.call(ODistributedStorage.java:644)
	at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$5.call(ODistributedStorage.java:638)
	at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage$11.call(ODistributedStorage.java:1220)
	at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:70)
	at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage.executeRecordOperationInLock(ODistributedStorage.java:1217)
	at com.orientechnologies.orient.server.distributed.impl.ODistributedStorage.createRecord(ODistributedStorage.java:637)
	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2216)
	at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveNew(OTransactionNoTx.java:241)
	at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:171)
	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2782)
	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2662)
	at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:103)
	at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.createRecord(ONetworkProtocolBinary.java:2844)
	at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.createRecord(ONetworkProtocolBinary.java:1832)
	at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:629)
	at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.sessionRequest(ONetworkProtocolBinary.java:398)
	at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.execute(ONetworkProtocolBinary.java:217)
	at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:82)
2018-01-19_17:22:44 [Executor task launch worker-1] WARN orientdb.graph.dao.impl.GraphDaoImpl:215: renew and try again! tryCount: 1
2018-01-19_17:22:44 [Executor task launch worker-1] ERROR orientechnologies.orient.client.binary.OChannelBinaryAsynchClient:143: Error during exception deserialization
java.lang.NoSuchMethodException: com.orientechnologies.orient.core.exception.OPageIsBrokenException.<init>(com.orientechnologies.orient.core.exception.OPageIsBrokenException)
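
For anyone who wants to catch this condition from logs, here is a minimal sketch that pulls the broken file names and page indexes out of the exception message shown above. The class and method names (`BrokenPageParser`, `parse`) are hypothetical and not part of OrientDB. The entry format `'file' :page;` is inferred from the single entry in this trace, so the handling of messages with several entries is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OrientDB: extracts broken files/pages
// from an OPageIsBrokenException message as it appears in the log above.
final class BrokenPageParser {

    /** One (file name, page index) pair reported as broken. */
    static final class BrokenPage {
        final String file;
        final long page;

        BrokenPage(String file, long page) {
            this.file = file;
            this.page = page;
        }

        @Override
        public String toString() {
            return file + ":" + page;
        }
    }

    // The bracketed list that follows "detected to be broken".
    private static final Pattern LIST =
            Pattern.compile("detected to be broken \\[(.*?)\\]");

    // A single entry such as 'ner_16.pcl' :138;
    // (assumed to repeat when several pages are broken).
    private static final Pattern ENTRY =
            Pattern.compile("'([^']+)'\\s*:(\\d+);");

    /** Returns every broken page found in the line, or an empty list. */
    static List<BrokenPage> parse(String logLine) {
        List<BrokenPage> result = new ArrayList<>();
        Matcher list = LIST.matcher(logLine);
        if (!list.find()) {
            return result; // not an OPageIsBrokenException message
        }
        Matcher entry = ENTRY.matcher(list.group(1));
        while (entry.find()) {
            result.add(new BrokenPage(entry.group(1),
                    Long.parseLong(entry.group(2))));
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "OPageIsBrokenException: Following files and pages are "
                + "detected to be broken ['ner_16.pcl' :138;], storage is "
                + "switched to 'read only' mode.";
        System.out.println(parse(line)); // prints [ner_16.pcl:138]
    }
}
```

A non-empty result could also be used by a client as a signal to stop retrying writes, since the message says the storage is in 'read only' mode and modification operations are prohibited.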

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 40 (16 by maintainers)

Most upvoted comments

We ended up migrating to MongoDB 🙈

@laa I managed to follow the repair instructions with the build that you provided. However, it looks like it did not repair anything for me.

orientdb {db = "test-db"} > repair database --fix-cluster

Repairing database...
Repair database complete (0 errors)
orientdb {db = "test-db"} > repair database --fix-links

Repairing database...
- Removing broken links...
-- Done! Fixed links: 0, modified documents: 0
Repair database complete (0 errors)

Hi, it will be released in a new 2.2.x version soon. In the meantime, here is a build which already contains this tool: https://drive.google.com/file/d/1cuZL4qZ6OKHLSgelryOWI8wiqjnsT7AA/view?usp=sharing

You need to do the following:

  1. Open the database in embedded mode using this build.
  2. In the console, execute the command repair database --fix-cluster
  3. Close the console and open it again.
  4. In the console, execute the command repair database --fix-links
  5. Close the console and open the database as usual.

Please send feedback on how it works. @devsprint if you experience any issues with broken pages after that, it would be great if you could send me the server log covering the period since this repair was executed.

@laa it is reproducible on version 2.2.32 too. I did not find the root cause, but it happens every few days. The export and import solution is not viable for a production environment. Is there any progress on this issue?

Hi guys. This durability issue is fixed in 3.0.13, but to take advantage of the fix you need to either create a database from scratch and use it, or export the data to JSON and then import it back. Please let me know whether that fixes the issue. Unfortunately, to fix this issue 100% we need to rewrite the underlying architecture, so I suggest you migrate to version 3.0.13.

Hi @laa, thank you very much for looking into this serious issue.

My company is also running into it, and we were wondering what the cause of the problem is but couldn’t find anything in the changelog for version 3.0.13 or any linked PR for the fix. We are preparing to migrate to OrientDB 3.0, but as our production cluster is going to be on 2.2.37 in the meantime, we’d like to try to mitigate and minimize the impact of the problem as much as possible.

After all, it’s really difficult for us to focus on migrating to OrientDB 3.0 when our existing database is on fire on a regular basis. Any help or suggestions you are able to provide would be very much appreciated.

I upgraded the specs of the VM we were running it on, gave it another CPU.

Can’t say for sure if it’s fixed, but I haven’t had the issue since then.

@devsprint it is already released. @acuciureanu sorry for the delay, I will try to finish it today.

The reason for this issue is that the server crashed and was only partially restored, or an old version of the DB was used, and as a result a page was broken. We will port part of the durability fixes applied in 3.0 to 2.2.x to prevent this issue. But if the data on a page is already broken, no software update will fix that. For commercial support, I would propose that you send the database to us and we will fix it in one day. For community support, the only means right now is to export and import the database. A database in read-only mode can still be exported to JSON; read-only mode does not prevent the export. Anyway, let me think; maybe I will find a way to isolate the page in the cluster so that you do not access it, which would prevent the issue in production. I will update you today with whatever solution I come up with.

The method to restore it is stated in the exception: “To restore database and make it fully operational you may export and import database to and from JSON”