Implement Hadamard product for matrices with different memory layouts by Copilot · Pull Request #37 · Quafadas/vecxt

Copilot · 2025-10-02T11:16:23Z

Overview

This PR implements the Hadamard (element-wise) product for matrices with differing memory layouts, removing the previous limitation that required matrices to have identical dense layouts.

Problem

Previously, the hadamard operation would throw a NotImplementedError when given matrices with non-simple or different memory layouts (e.g., sliced views, transposed matrices, or mixed row-major/column-major). This limited the operation to only work with matrices that had identical, simple contiguous layouts.

val base = Matrix[Double](NArray.tabulate[Double](9)(_.toDouble), 3, 3)
val view = base(::, NArray(1, 2))  // Create a non-contiguous view
view.hadamard(view)  // Previously threw NotImplementedError

Solution

The implementation now handles matrices with different memory layouts by:

Fast path (unchanged): When both matrices have compatible dense element-wise memory layouts, use SIMD-optimized array multiplication directly
Different layouts: Intelligently materializes only one matrix to match the other's layout (row-major or column-major), then uses in-place multiplication
Cross-platform: Works on JVM (with SIMD), JS, and Native platforms

Changes

Matrix Operations

Modified vecxt/src/doublematrix.scala to handle different memory layouts
Hadamard now detects which matrix has a dense layout (row-major or column-major) and materializes the other to match
Uses existing array element-wise in-place multiplication (*=) for better performance

Enhanced deepCopy

Updated vecxt/src/MatrixInstance.scala with overloaded deepCopy method
Supports both column-major (default) and row-major output via deepCopy(asRowMajor: Boolean)
Maintains backward compatibility with parameterless version

New Array Operations

Added element-wise in-place array multiplication (*=) for all platforms:
- JVM (vecxt/src-jvm/arrays.scala): SIMD-optimized using DoubleVector
- JS (vecxt/src-js/array_native.scala): Efficient while loop
- Native (vecxt/src-native/array_js_native.scala): Efficient while loop

Tests

Removed test expecting NotImplementedError for non-simple layouts
Added comprehensive tests covering:
- Non-contiguous matrix views (column slices)
- Mixed layouts (simple and non-simple matrices)
- Transposed matrices (row-major vs column-major)

Testing

All tests pass on JVM and Native platforms:

✅ Existing hadamard tests continue to pass
✅ New tests verify correctness with different layouts
✅ No performance regression for the common case (same layouts)
✅ Row-major matrices now handled optimally

Performance Optimizations

Only deepCopy one matrix (whichever needs materialization)
Respects both row-major and column-major layouts, avoiding unnecessary conversions
Uses SIMD-optimized array *= operation instead of custom multiplication loops
Multiply in-place to avoid third allocation

This implementation follows the pattern established by other element-wise operations in the codebase (like +:+, -:-) and maintains the library's commitment to correctness as the #1 priority while leveraging SIMD optimizations where possible.

Fixes #36

Original prompt

This section details on the original issue you should resolve

<issue_title>Hadamard product with different memory layouts.</issue_title>
<issue_description>The goal is to implement the hadmard product of two matricies, which have differing memory layouts.

On other words, to remove this line which ensures, that the operation is untested.

vecxt/vecxt/test/src/notImpl.test.scala

Lines 25 to 26 in b0b5fbb

test("DoubleMatrix.hadamard with non-simple layout throws") {

intercept[NotImplementedError](m.hadamard(m))

An initial implementation would need to be here. For native and JS, we should be able to use the "obvious" implementation.

vecxt/vecxt/src/doublematrix.scala

Lines 147 to 148 in b0b5fbb

else ???

end if

Add tests to ensure correctness.

On JVM, The implementation should materialise the "most strided" matrix into a dense memory format compatible with it's counterpart, then use java's SIMD to multiply the matricies together, in order to optimise performance. </issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #36

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Quafadas · 2025-10-02T12:08:58Z

@copilot You should be able to improve performance here in two ways;

Curretly you deep copy the each matrix then make another array copy with the multiplication. Please attempt the following improvements;

Only one matrix needs to be "deepCopied".
We should then be able to multiply that (single deepcopied) matrix / vector "in place", and return that.

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Copilot · 2025-10-02T12:17:04Z

@copilot You should be able to improve performance here in two ways;

Curretly you deep copy the each matrix then make another array copy with the multiplication. Please attempt the following impro...

Optimized in commit 9bdee4d:

✅ Only one matrix is deepCopied now - the implementation checks which matrix is already dense column-major and only materializes the other one
✅ Multiply in-place into the materialized matrix's raw array, avoiding the third allocation

The optimization uses element-wise in-place multiplication when one matrix is already dense, and falls back to SIMD array multiplication only when both matrices need materialization (worst case).

All tests pass on JVM and Native platforms.

Quafadas · 2025-10-02T12:27:26Z

@copilot we're getting there, but I can see that the case where the matricies are "rowMajor" could also be handled in a more optimal way. Currently, "deepCopy" returns a Column Major matrix.

As a part of this, update the deepCopy method, to return either colMajor or rowMajor (introduce a flag).

Inside the hadamard method, instead of writing out your own array multiplication logic, if either matrix has a dense memory layout, deep copy the other one and then use the existing array "multiply in place" functionality.

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Copilot · 2025-10-02T12:39:46Z

@copilot we're getting there, but I can see that the case where the matricies are "rowMajor" could also be handled in a more optimal way. Currently, "deepCopy" returns a Column Major matrix.

As a ...

Implemented in commit 0b72a91:

✅ Updated deepCopy to support both column-major (default) and row-major layouts via an overloaded method
✅ Added element-wise in-place array multiplication (*=) for JVM, JS, and Native platforms
✅ Updated hadamard to detect which matrix has a dense layout (row-major or column-major), deepCopy the other one to match that layout, then use the array *= operation for in-place multiplication

The implementation now:

Handles row-major matrices more efficiently by materializing to row-major when appropriate
Uses the SIMD-optimized array *= operation instead of custom multiplication loops
Only creates one copy when at least one matrix has a dense layout

All tests pass on JVM and Native platforms.

Copilot

Pull Request Overview

Implements Hadamard (element-wise) product for matrices with differing memory layouts, removing the previous limitation to identical dense layouts.

Adds in-place array multiplication (*=) across platforms and uses it to avoid unnecessary materializations where possible
Extends Matrix.deepCopy to support row-major or column-major targets
Updates hadamard to materialize minimally and handle mixed/simple/non-simple layouts

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
vecxt/test/src/notImpl.test.scala	Removes the test expecting NotImplementedError for hadamard on non-simple layouts.
vecxt/test/src/matrix.test.scala	Adds tests covering non-simple views, mixed layouts, and transposed matrices for hadamard.
vecxt/src/doublematrix.scala	Implements hadamard across differing layouts with fast-path for compatible dense layouts and materialization logic otherwise.
vecxt/src/MatrixInstance.scala	Adds deepCopy(asRowMajor: Boolean) and updates default deepCopy to delegate.
vecxt/src-jvm/arrays.scala	Introduces SIMD-optimized in-place array multiplication (*=) for JVM.
vecxt/src-js/array_native.scala	Adds in-place array multiplication (*=) for Scala.js.
vecxt/src-native/array_js_native.scala	Adds in-place array multiplication (*=) for Scala Native.

Copilot · 2025-10-02T12:58:35Z

vecxt/src/doublematrix.scala

+          // Neither is dense, materialize both to column-major and use SIMD multiplication
+          val mDense = m.deepCopy(asRowMajor = false)
+          val m2Dense = m2.deepCopy(asRowMajor = false)
+          val newArr = vecxt.arrays.*(mDense.raw)(m2Dense.raw)
+          Matrix[Double](newArr, m.rows, m.cols)(using BoundsCheck.DoBoundsCheck.no)


This path allocates three arrays (two deep copies plus a new result from arrays.*). To avoid the extra allocation, multiply in-place into one of the deep copies and return it, e.g.:

val mDense = m.deepCopy(asRowMajor = false) val m2Dense = m2.deepCopy(asRowMajor = false) vecxt.arrays.*=(mDense.raw)(m2Dense.raw) mDense

Suggested change

// Neither is dense, materialize both to column-major and use SIMD multiplication

val mDense = m.deepCopy(asRowMajor = false)

val m2Dense = m2.deepCopy(asRowMajor = false)

val newArr = vecxt.arrays.*(mDense.raw)(m2Dense.raw)

Matrix[Double](newArr, m.rows, m.cols)(using BoundsCheck.DoBoundsCheck.no)

// Neither is dense, materialize m to column-major and multiply in-place

val mDense = m.deepCopy(asRowMajor = false)

vecxt.arrays.*=(mDense.raw)(m2.raw)

mDense

Copilot · 2025-10-02T12:58:35Z

vecxt/src/MatrixInstance.scala

+      val newMat =
+        if asRowMajor then Matrix(newRaw, m.rows, m.cols, m.cols, 1, 0) // row-major: rowStride = cols, colStride = 1
+        else Matrix(newRaw, m.rows, m.cols, 1, m.rows, 0) // column-major: rowStride = 1, colStride = rows
      var i = 0


The local counter i is never used except for an increment inside the loops; it can be removed to reduce noise and avoid confusion.

Initial plan

f801d1f

Copilot AI assigned Copilot and Quafadas Oct 2, 2025

Copilot started work on behalf of Quafadas October 2, 2025 11:16 View session

Copilot AI and others added 2 commits October 2, 2025 11:31

Implement hadamard product with different memory layouts

c5c40bf

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Remove temporary experiment file

f588c80

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Hadamard product with different memory layouts.~~ Implement Hadamard product for matrices with different memory layouts Oct 2, 2025

Copilot AI requested a review from Quafadas October 2, 2025 11:34

Copilot finished work on behalf of Quafadas October 2, 2025 11:34

Copilot started work on behalf of Quafadas October 2, 2025 12:09 View session

Optimize hadamard: only deepCopy one matrix and multiply in-place

9bdee4d

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Copilot finished work on behalf of Quafadas October 2, 2025 12:17

Copilot started work on behalf of Quafadas October 2, 2025 12:27 View session

Support row-major deepCopy and use array *= for hadamard

0b72a91

Co-authored-by: Quafadas <24899792+Quafadas@users.noreply.github.com>

Copilot finished work on behalf of Quafadas October 2, 2025 12:40

Quafadas marked this pull request as ready for review October 2, 2025 12:54

Copilot AI review requested due to automatic review settings October 2, 2025 12:54

Quafadas merged commit 6285354 into main Oct 2, 2025
12 of 18 checks passed

Copilot AI mentioned this pull request Oct 2, 2025

Add documentation for Hadamard product with different memory layouts (PR #37) #39

Draft

Copilot AI reviewed Oct 2, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Implement Hadamard product for matrices with different memory layouts#37

Implement Hadamard product for matrices with different memory layouts#37
Quafadas merged 5 commits intomainfrom
copilot/fix-7ebc2283-e18a-45e5-b01d-9656d56639cf

Copilot AI commented Oct 2, 2025 •

edited

Loading

Uh oh!

Quafadas commented Oct 2, 2025

Uh oh!

Copilot AI commented Oct 2, 2025

Uh oh!

Quafadas commented Oct 2, 2025

Uh oh!

Copilot AI commented Oct 2, 2025

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Oct 2, 2025

Uh oh!

Copilot AI Oct 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	test("DoubleMatrix.hadamard with non-simple layout throws") {
	intercept[NotImplementedError](m.hadamard(m))

Uh oh!

Conversation

Copilot AI commented Oct 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Problem

Solution

Changes

Matrix Operations

Enhanced deepCopy

New Array Operations

Tests

Testing

Performance Optimizations

Comments on the Issue (you are @copilot in this section)

Uh oh!

Quafadas commented Oct 2, 2025

Uh oh!

Copilot AI commented Oct 2, 2025

Uh oh!

Quafadas commented Oct 2, 2025

Uh oh!

Copilot AI commented Oct 2, 2025

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 2, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 2, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Oct 2, 2025 •

edited

Loading