```
https://data.genhub.co/datasets/crispor.json
```

Properties:

- `x` - Shape of `[26052, 92]`
- `y` - Shape of `[26052]`

Contains 20 encoded pairs under the `x` property and scores under the `y` property. For encoding, a one-hot encoding algorithm is used, where each letter is converted to a 4-item array:

- `A` is `[1, 0, 0, 0]`
- `T` is `[0, 1, 0, 0]`
- `G` is `[0, 0, 1, 0]`
- `C` is `[0, 0, 0, 1]`

Encoded letters are then concatenated, e.g. `AG` would result in `[1, 0, 0, 0, 0, 0, 1, 0]`. A bitwise `OR` is then applied between the two encoded sequences, e.g. `AG` OR `AC` would result in `[1, 0, 0, 0, 0, 0, 1, 1]`.
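The encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the dataset; the names `ONE_HOT`, `encode_sequence`, and `encode_pair` are hypothetical, and the sketch assumes both sequences have the same length.

```python
# One-hot table copied from the description above.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "T": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "C": [0, 0, 0, 1],
}


def encode_sequence(seq):
    """Concatenate the 4-item one-hot arrays of each letter in seq."""
    encoded = []
    for letter in seq:
        encoded.extend(ONE_HOT[letter])
    return encoded


def encode_pair(seq_a, seq_b):
    """Encode both sequences and combine them element-wise with bitwise OR."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [a | b for a, b in zip(encode_sequence(seq_a), encode_sequence(seq_b))]


print(encode_sequence("AG"))    # [1, 0, 0, 0, 0, 0, 1, 0]
print(encode_pair("AG", "AC"))  # [1, 0, 0, 0, 0, 0, 1, 1]
```

Under this scheme a pair of 23-letter sequences produces 23 × 4 = 92 values, which would be consistent with the second dimension of `x`; the sequence length itself is an inference from the shape, not something the dataset description states.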

```
https://data.genhub.co/datasets/mismatch_scores.json
```

Properties:

`r${letter1}:d${letter2},${position}`

where:

- `letter1` - Nucleotide of target sequence (`A`, `T`, `G` or `C`)
- `letter2` - Nucleotide of gRNA
- `position` - Position of the letters in the sequence

The value of each property is the score for that particular pair.

Example: `rA:dA,11`: `0.307692308`
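As a rough sketch, a lookup key can be assembled from two letters and a position and then used to index the parsed JSON object. The helper `mismatch_key` and the `sample_scores` dictionary are hypothetical; the dictionary holds only the example entry shown above and stands in for the full file.

```python
def mismatch_key(letter1, letter2, position):
    """Build a key such as 'rA:dA,11' from two letters and a position."""
    return f"r{letter1}:d{letter2},{position}"


# Tiny stand-in for the parsed JSON; only the documented example entry is included.
sample_scores = {"rA:dA,11": 0.307692308}

key = mismatch_key("A", "A", 11)
print(key, sample_scores[key])  # rA:dA,11 0.307692308
```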

```
https://data.genhub.co/datasets/pam_scores.json
```

Properties:

`NN`, where `N` is any nucleotide. The value of each property is the score for that PAM sequence.

Example: `CG`: `0.107142857`
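The key format can be illustrated by enumerating every two-letter combination and looking one up. This is only a sketch: `candidate_keys`, `pam_score`, and `sample_pam_scores` are hypothetical names, the sample dictionary contains just the documented example, and whether the real file defines all 16 combinations is not stated here, so the lookup returns `None` for absent keys.

```python
from itertools import product

NUCLEOTIDES = "ATGC"

# Every possible two-letter key of the form `NN` (4 * 4 = 16 combinations).
candidate_keys = ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]

# Stand-in for the parsed JSON; only the documented example entry is included.
sample_pam_scores = {"CG": 0.107142857}


def pam_score(pam, scores):
    """Return the score for a two-letter PAM key, or None if it is absent."""
    return scores.get(pam)


print(len(candidate_keys))                 # 16
print(pam_score("CG", sample_pam_scores))  # 0.107142857
```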