We encounter random errors in production that are hard to reproduce. They resemble the errors reported in #963, #965, and others, for example:
System.InvalidCastException: Unable to cast object of type 'FirebirdSql.Data.Client.Managed.FetchResponse' to type 'FirebirdSql.Data.Client.Managed.GenericResponse'.
at FirebirdSql.Data.Client.Managed.Version10.GdsTransaction.BeginTransactionAsync(TransactionParameterBuffer tpb, CancellationToken cancellationToken)
at FirebirdSql.Data.Client.Managed.Version10.GdsDatabase.BeginTransactionAsync(TransactionParameterBuffer tpb, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbTransaction.BeginTransactionAsync(CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.PrepareAsync(Boolean returnsSet, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteCommandAsync(CommandBehavior behavior, Boolean returnsSet, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
and
System.IndexOutOfRangeException: Index was outside the bounds of the array.
at FirebirdSql.Data.Client.Managed.Version10.GdsStatement.ParseTruncSqlInfoAsync(Byte[] info, Byte[] items, Descriptor[] rowDescs, CancellationToken cancellationToken)
at FirebirdSql.Data.Client.Managed.Version10.GdsStatement.ProcessPrepareResponseAsync(GenericResponse response, CancellationToken cancellationToken)
at FirebirdSql.Data.Client.Managed.Version11.GdsStatement.PrepareAsync(String commandText, CancellationToken cancellationToken)
at FirebirdSql.Data.Client.Managed.Version11.GdsStatement.PrepareAsync(String commandText, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.PrepareAsync(Boolean returnsSet, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.PrepareAsync(Boolean returnsSet, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteCommandAsync(CommandBehavior behavior, Boolean returnsSet, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
The problem occurs randomly, but it appears to be fatal every time: the only way to recover is to restart the service. Removing antivirus and other endpoint security software, as suggested in this comment, reduced the number of incidents, but they still occur. The error sometimes happens during low traffic but is more common under heavy load.
We do not handle connections or transactions directly. PHP and C++ services run on the same server and connect to the Firebird database without any issues.
We created a minimal solution to reproduce the issue. It places a proxy between the main process and Firebird that randomly corrupts packets during a set period, which causes exceptions related to invalid operations, transaction handles, and others. Like our real project, it uses Entity Framework to manage connections and transactions. The proxy produces a wider variety of exceptions than we see in production, likely because it simulates the fault more aggressively.
We understand that errors are expected while the proxy is corrupting traffic. With PostgreSQL, however, the service recovers once the proxy stops corrupting packets. With Firebird, most threads freeze, and if the proxy corrupts packets for longer, all threads become unresponsive.
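For readers who want to picture the setup, a proxy of the kind described above could look roughly like the sketch below. This is not the code from our reproduction project; it is a minimal illustration in Python, and the names `corrupt_chunk`, `CorruptingProxy`, `corrupt_for`, and `maybe_corrupt`, as well as the 5% probability, are assumptions made for this example.

```python
import asyncio
import random
import time


def corrupt_chunk(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with one randomly chosen byte altered."""
    if not data:
        return data
    buf = bytearray(data)
    i = rng.randrange(len(buf))
    buf[i] ^= rng.randrange(1, 256)  # nonzero mask, so the byte always changes
    return bytes(buf)


class CorruptingProxy:
    """Hypothetical TCP forwarder that damages traffic during a time window."""

    def __init__(self, upstream_host, upstream_port, probability=0.05, seed=None):
        self.upstream = (upstream_host, upstream_port)
        self.probability = probability  # chance that a forwarded chunk is damaged
        self.rng = random.Random(seed)
        self.corrupt_until = 0.0  # monotonic deadline; 0 means "never corrupt"

    def corrupt_for(self, seconds: float) -> None:
        """Open the corruption window for the given number of seconds."""
        self.corrupt_until = time.monotonic() + seconds

    def maybe_corrupt(self, data: bytes) -> bytes:
        """Damage the chunk only while the window is open, with the set probability."""
        active = time.monotonic() < self.corrupt_until
        if active and self.rng.random() < self.probability:
            return corrupt_chunk(data, self.rng)
        return data

    async def _pipe(self, reader, writer):
        # Copy bytes in one direction until the peer closes or the socket fails.
        try:
            while data := await reader.read(4096):
                writer.write(self.maybe_corrupt(data))
                await writer.drain()
        except OSError:
            pass
        finally:
            writer.close()

    async def _handle(self, client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(*self.upstream)
        except OSError:
            client_writer.close()
            return
        # Forward both directions concurrently.
        await asyncio.gather(
            self._pipe(client_reader, up_writer),
            self._pipe(up_reader, client_writer),
        )

    async def serve(self, listen_host, listen_port):
        server = await asyncio.start_server(self._handle, listen_host, listen_port)
        async with server:
            await server.serve_forever()
```

To use a sketch like this, one would create `CorruptingProxy("127.0.0.1", 3050)` (3050 being the default Firebird port), call `corrupt_for(30)` to open a 30-second corruption window, run `asyncio.run(proxy.serve("127.0.0.1", 3051))`, and point the application's connection string at the listening port (3051 is an arbitrary choice here). Once the window closes, traffic passes through unchanged, which is the point at which we would expect the service to recover.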
Is it expected that the service does not recover?