fix: implement Symbol.hasInstance for cross-library instanceof checks #377

Open
Divyanshu-s13 wants to merge 5 commits into apache:main from Divyanshu-s13:fix-instanceof-cross-library

Conversation

@Divyanshu-s13
Contributor

What's Changed
Fixed the instanceof check issue that occurs when multiple versions or instances of the Arrow library are loaded in the same application. Previously, checks like `value instanceof Schema` would fail if the value came from a different Arrow library instance (e.g., when a library like LanceDB uses a different Arrow version than the user's code).

Now instanceof works reliably across different Arrow library instances by using global symbols for type identification.

Also added helper functions like `isArrowSchema()`, `isArrowTable()`, etc. for explicit type checking.

Closes #61.

@Divyanshu-s13
Contributor Author

@kou I’ve fixed the issue. Could you please merge it?

@kou
Member

kou commented Feb 6, 2026

Could you add tests for this?

@kou changed the title from "fix: implement Symbol.hasInstance for cross-library instanceof checks#61" to "fix: implement Symbol.hasInstance for cross-library instanceof checks" on Feb 6, 2026

Copilot AI left a comment

Pull request overview

This pull request implements cross-library instanceof checks for Apache Arrow's core types by using Symbol.for() based markers and Symbol.hasInstance. The change addresses a long-standing issue where instanceof checks would fail when multiple versions or instances of the Arrow library were loaded in the same application, particularly affecting integrations with libraries like LanceDB.

Changes:

  • Implemented Symbol.hasInstance and static is* methods for Schema, Field, DataType, Data, Vector, RecordBatch, and Table classes to enable reliable instanceof checks across different Arrow library instances
  • Added new utility file with helper functions (isArrowSchema, isArrowTable, etc.) for explicit type checking
  • Properly exported new functionality through the main Arrow.ts entry point
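The mechanism summarized above can be sketched in a few lines. This is a minimal illustration, not the PR's code: `loadLibraryCopy` and the key `'example-lib/Schema'` are hypothetical names, and each call to the factory stands in for a separately loaded copy of a library.

```typescript
// Each call simulates loading a separate copy of a library: it looks up the
// marker in the global symbol registry and defines a brand-new Schema class.
function loadLibraryCopy() {
    // Hypothetical registry key, for illustration only.
    const marker: symbol = Symbol.for('example-lib/Schema');

    class Schema {
        readonly fieldNames: string[];

        constructor(fieldNames: string[]) {
            this.fieldNames = fieldNames;
        }

        // Explicit check: true for any object carrying the shared marker.
        static isSchema(x: unknown): x is Schema {
            return (x as any)?.[marker] === true;
        }

        // Called by the engine when evaluating `x instanceof Schema`.
        static [Symbol.hasInstance](x: unknown): boolean {
            return Schema.isSchema(x);
        }
    }

    // Put the marker on the prototype so every instance inherits it.
    Object.defineProperty(Schema.prototype, marker, { value: true });
    return Schema;
}

const SchemaA = loadLibraryCopy();
const SchemaB = loadLibraryCopy();
const fromB = new SchemaB(['id', 'name']);

// The marker-based check succeeds across copies.
const crossCopyInstanceof = fromB instanceof SchemaA; // true
// The prototype chains are unrelated, which is why the default check fails.
const sharesPrototype = SchemaA.prototype.isPrototypeOf(fromB); // false
// Objects without the marker are still rejected.
const plainObjectMatches = ({} as object) instanceof SchemaA; // false
```

Because both copies ask the global registry for the same key, they agree on the marker even though their classes and prototypes are distinct objects.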

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/schema.ts | Added Symbol.hasInstance and isSchema/isField methods using global symbols for cross-instance type checking |
| src/type.ts | Added Symbol.hasInstance and isDataType method for DataType class |
| src/data.ts | Added Symbol.hasInstance and isData method for Data class |
| src/vector.ts | Added Symbol.hasInstance and isVector method for Vector class |
| src/recordbatch.ts | Added Symbol.hasInstance and isRecordBatch method for RecordBatch class |
| src/table.ts | Added Symbol.hasInstance and isTable method for Table class |
| src/util/typecheck.ts | New utility file providing exported helper functions for type checking that work across library instances |
| src/Arrow.ts | Added exports for new type checking helper functions and integrated them into the util namespace |

Comment on lines +28 to +42
/**
 * Check if an object is an instance of Schema.
 * This works across different instances of the Arrow library.
 */
static isSchema(x: any): x is Schema {
    return x?.[kSchemaSymbol] === true;
}

/**
 * Custom instanceof handler to work across different Arrow library instances.
 * @see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/hasInstance
 */
static [Symbol.hasInstance](x: any): x is Schema {
    return Schema.isSchema(x);
}

Copilot AI Feb 6, 2026

The new type checking functionality (Symbol.hasInstance and static is* methods) lacks test coverage. Consider adding tests that:

  1. Verify instanceof works across different library instances using Symbol.hasInstance
  2. Test the static is* methods (isSchema, isField, isDataType, etc.)
  3. Test the exported helper functions (isArrowSchema, isArrowField, etc.)
  4. Verify the behavior when objects don't have the marker symbol

These tests would ensure the cross-library instanceof functionality works as intended and doesn't regress in future changes.
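As a rough sketch of item 4, a table of negative cases might look like the following. It does not use the repository's test setup; `isSchemaLike` and the key `'example-lib/Schema'` are hypothetical stand-ins for the PR's `isSchema` and its marker, which are assumed to behave like the `x?.[kSchemaSymbol] === true` check in the diff.

```typescript
// Hypothetical marker key and guard, mirroring the shape of the check in the diff.
const kSchemaSymbol: symbol = Symbol.for('example-lib/Schema');

function isSchemaLike(x: unknown): boolean {
    return (x as any)?.[kSchemaSymbol] === true;
}

// Objects that carry the marker with a true and a non-true value.
const marked = Object.defineProperty({}, kSchemaSymbol, { value: true });
const wrongValue = Object.defineProperty({}, kSchemaSymbol, { value: 'yes' });

// Each case: description, input, expected result.
const cases: Array<[string, unknown, boolean]> = [
    ['object with marker', marked, true],
    ['plain object', {}, false],
    ['null', null, false],
    ['undefined', undefined, false],
    ['number', 42, false],
    ['string', 'schema', false],
    ['marker with non-true value', wrongValue, false],
];

// Collect the names of any cases whose result differs from the expectation.
const failures = cases
    .filter(([, value, expected]) => isSchemaLike(value) !== expected)
    .map(([name]) => name);
```

The first case also shows a consequence of the design: any object that carries the marker passes the check, whether or not it was constructed by the class, so the check is duck typing by symbol rather than a prototype test.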

@Divyanshu-s13
Contributor Author

> Could you add tests for this?

I have added a test for this.

Member

@domoritz left a comment

Very nice.

  • I wonder whether we can do some kind of check whether a second instance of arrow exists. We could show a warning. Otherwise I'd worry that we will at some point run into issues where people have multiple versions of Arrow in their systems for a long time.
  • What's the performance impact of these changes? This needs to be addressed before we can merge the changes.
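One possible shape for the duplicate-instance check in the first bullet, sketched under assumptions: the registry key `'example-lib/loaded-versions'` and the function `registerLibraryCopy` are hypothetical and not part of this PR. Each copy would call the function once at load time, passing something like `console.warn` as the reporter.

```typescript
// Hypothetical global registry key shared by every loaded copy.
const kLoadedVersions: symbol = Symbol.for('example-lib/loaded-versions');

// Records this copy's version on globalThis and reports when another copy
// has already registered. The reporter is injected so callers choose how to warn.
function registerLibraryCopy(version: string, warn: (message: string) => void): string[] {
    const g = globalThis as any;
    const loaded: string[] = g[kLoadedVersions] ?? (g[kLoadedVersions] = []);
    if (loaded.length > 0) {
        warn('Multiple copies loaded: ' + loaded.concat(version).join(', '));
    }
    loaded.push(version);
    return loaded;
}

// Simulate two copies loading in the same process.
const warnings: string[] = [];
const collect = (message: string): void => {
    warnings.push(message);
};
registerLibraryCopy('1.0.0', collect); // first copy: nothing to report
registerLibraryCopy('2.0.0', collect); // second copy: one warning
```

A registry keyed by `Symbol.for` is visible to every copy in the same realm, which is the same property the type markers rely on.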

}

/** @ignore */
const kDataSymbol = Symbol.for('apache-arrow/Data');
Member

This is a good solution. We just need to make sure we don't accidentally use Symbol() instead of Symbol.for(), since Symbol() returns a new, unique symbol on every call, so separate Arrow instances would not share it.
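The difference is easy to demonstrate. In this small sketch the two helper functions are hypothetical; each call stands in for a separate library copy creating its marker key.

```typescript
// Registry lookup: every caller gets the same symbol for the same key.
function markerFromRegistry(): symbol {
    return Symbol.for('apache-arrow/Data');
}

// Plain Symbol(): every call creates a fresh symbol, even with the same description.
function markerLocal(): symbol {
    return Symbol('apache-arrow/Data');
}

// Tags a new object with the given marker key.
function tag(key: symbol): object {
    return Object.defineProperty({}, key, { value: true });
}

// One "copy" tags the object, another "copy" looks the marker up.
const sharedSeen = markerFromRegistry() in tag(markerFromRegistry()); // true
const localSeen = markerLocal() in tag(markerLocal()); // false
```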

Member

@domoritz left a comment

See request for a perf analysis. We have benchmarks in this repo in https://github.com/apache/arrow-js/tree/main/perf.

@Divyanshu-s13
Contributor Author

> Very nice.
>
>   • I wonder whether we can do some kind of check whether a second instance of arrow exists. We could show a warning. Otherwise I'd worry that we will at some point run into issues where people have multiple versions of Arrow in their systems for a long time.
>   • What's the performance impact of these changes? This needs to be addressed before we can merge the changes.

Great suggestions!

  1. Multiple instance detection: I can add a version check that warns users when multiple Arrow versions are loaded. Will implement this.

  2. Performance impact: The `Symbol.hasInstance` check is a single property lookup (`x?.[kSymbol] === true`), which should be O(1). I'll run benchmarks to measure:

     • instanceof with vs. without Symbol.hasInstance
     • Impact on hot paths like Table iteration
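A rough way to get a first number for the first measurement is a timing loop like the one below. This is only a sketch, not the benchmark suite in the repository's perf directory: the classes `Plain` and `Marked`, the key `'example-lib/Marked'`, and `timeChecks` are hypothetical, and timings from a loop like this vary with the engine and with JIT warm-up.

```typescript
// Hypothetical marker key for the class with a custom instanceof handler.
const kMarker: symbol = Symbol.for('example-lib/Marked');

// Baseline: default prototype-chain instanceof.
class Plain {}

// Variant: instanceof routed through Symbol.hasInstance and a marker lookup.
class Marked {
    static [Symbol.hasInstance](x: unknown): boolean {
        return (x as any)?.[kMarker] === true;
    }
}
Object.defineProperty(Marked.prototype, kMarker, { value: true });

// Runs `rounds` passes of instanceof checks over `values` and reports
// how many matched and how long the loop took in milliseconds.
function timeChecks(ctor: Function, values: object[], rounds: number): { hits: number; ms: number } {
    const start = Date.now();
    let hits = 0;
    for (let r = 0; r < rounds; r++) {
        for (const v of values) {
            if (v instanceof ctor) hits++;
        }
    }
    return { hits, ms: Date.now() - start };
}

// Half of each input list matches, half does not.
const plainValues: object[] = [new Plain(), {}, new Plain(), {}];
const markedValues: object[] = [new Marked(), {}, new Marked(), {}];

const plainResult = timeChecks(Plain, plainValues, 100000);
const markedResult = timeChecks(Marked, markedValues, 100000);
```

Comparing `plainResult.ms` with `markedResult.ms` over several runs gives a first impression; checking that both report the same `hits` confirms the two loops did equivalent work.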

@Divyanshu-s13
Copy link
Contributor Author

@domoritz Could you please let me know if any further changes are needed, or if this is ready to merge?

Development

Successfully merging this pull request may close these issues.

[JS] Fix instanceof or move away from instanceof within arrow-js

3 participants