Changes from 41 commits
Commits
54 commits
8335f11
new shiny create_model code that is more memory efficient
spreeker Oct 28, 2020
69f6ade
enable mistaken disabled code
spreeker Oct 28, 2020
d7bdcab
update readme
spreeker Oct 28, 2020
a34870a
add column validation
spreeker Nov 2, 2020
841bf49
reduce count was not registered fixed in template
spreeker Nov 12, 2020
aaa23b0
fix validation
spreeker Nov 19, 2020
a1bd933
start work on geo s2 stuff
spreeker Nov 19, 2020
622f46e
wip s2 test
spreeker Nov 27, 2020
9512c74
first working geo-selection
spreeker Dec 3, 2020
613bdff
working geo-selection
spreeker Dec 3, 2020
8ab6bb8
add curl test case for geo query
spreeker Dec 3, 2020
93f801d
wip test with labels
spreeker Dec 7, 2020
63d3429
geosearch works
spreeker Dec 8, 2020
c370cef
improve geojson handling
spreeker Dec 9, 2020
1562973
show http table
spreeker Dec 14, 2020
5059fef
keep lambda working for data without geometry
spreeker Jan 12, 2021
2ec0762
create model now has ingore and geocolumn options
spreeker Jan 14, 2021
b5c3a2f
use gzipped csv
spreeker Oct 8, 2020
0cc33c6
handle review remarks
spreeker Oct 27, 2020
507f79b
add groupby / reduce to query and validate parameters
spreeker Jan 19, 2021
7b83294
groupby cache experiment
spreeker Jan 19, 2021
8a59246
start factoring bitarray into templateable code
spreeker Jan 26, 2021
6b7be31
fix bugs using multiple bitarray keys
spreeker Jan 26, 2021
0d224b9
fix cache headers
spreeker Jan 26, 2021
b4be4ae
update model template
spreeker Jan 26, 2021
1fbfaf5
start with bitarray templateing
spreeker Jan 27, 2021
197e3dc
bit array model code generation is working now
spreeker Feb 3, 2021
fda371f
add custom groupby
spreeker Feb 4, 2021
0d525a5
add missing return
spreeker Feb 4, 2021
b893f60
validated and fixed reason for missing schools
spreeker Feb 10, 2021
280b26a
add woning equivalent reduce
spreeker Feb 15, 2021
f6e924f
try pgzip
spreeker Feb 15, 2021
ba495a2
added bouwjaar
spreeker Feb 17, 2021
fff9bcb
added readlock, moved custom code
spreeker Feb 17, 2021
aac4b78
update readme
spreeker Mar 9, 2021
041b486
fix bug returning raw item json
spreeker Mar 9, 2021
1c9f9d1
allow reduce without groupby
spreeker Mar 10, 2021
05a3627
add header column to csv
spreeker Mar 15, 2021
c2ff2d7
working merged build. labeledItems renamed to Items
spreeker Mar 22, 2021
d4a1ad6
remove merge mistake
spreeker Mar 22, 2021
ff690a7
remove merge mistake
spreeker Mar 22, 2021
7faf26b
wip working new storage / retrieve methods
spreeker Apr 19, 2021
a7fc206
first working tests
spreeker Apr 20, 2021
e74c533
first geo testing wip
spreeker Apr 20, 2021
c87ba16
working geojson tests, removed some code duplication
spreeker Apr 21, 2021
76c0d6b
add storage test, create example requests
spreeker Apr 21, 2021
cc247ce
wip: rewrite model creation, code, added column.go code
spreeker Apr 26, 2021
4a306f6
wip: fix model creation after rewrite
spreeker Apr 26, 2021
e6f1e4f
done: code generation now works correctly, added new model and model_…
spreeker Apr 27, 2021
5733650
fix: colom test
spreeker Apr 27, 2021
b51468a
docs: column.go
spreeker Apr 27, 2021
f8cb26e
docs: model.go creation
spreeker Apr 27, 2021
29daf04
production version, improved error reporting about bitarray usage
spreeker May 4, 2021
bf9d65e
added huisnummer / toevoegingen many small fixes to code generation
spreeker May 10, 2021
3 changes: 3 additions & 0 deletions .dockerignore
@@ -0,0 +1,3 @@
*.csv
*.csv2
.git
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
extras/model.go
3 changes: 2 additions & 1 deletion Dockerfile
@@ -5,7 +5,7 @@ RUN apk update && apk add --no-cache git
RUN apk --no-cache add ca-certificates

WORKDIR /app
COPY . /app/
COPY *.go /app/

# Fetch dependencies.
RUN go get -d -v
@@ -23,6 +23,7 @@ COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
#COPY --from=builder /app/files/ITEMS.txt.gz /app/files/ITEMS.txt.gz

WORKDIR /app

# Run the binary.

ENV http_db_host "0.0.0.0:8000"
58 changes: 52 additions & 6 deletions README.md
@@ -1,32 +1,78 @@
# LambdaDB
In memory database that uses filters to get the data you need.
Lambda DB has a tiny codebase which does a lot.
Lambda is not meant as persistent storage or a replacement for a traditional
database, but as a fast analytics and cache representation engine.

Can be used for your needs by changing the models.go file to your needs.
powers: https://dego.vng.nl

## Properties:

- Insanely fast API. 1ms responses
- Fast to setup.
- Easy to deploy.
- Easy to customize.
- Easy to export data

Owner:

Great to read!!!

- Implement custom authorized filters.

## Indexes

- S2 geoindex for fast point lookup
- Bitarrays
- Mapping

- Your own special needs indexes!

## Flow:

Generate a model and load your data.
The API is generated from your model.
Deploy.

Condition: Your dataset must fit in memory.

Can be used for your needs by changing the `models.go` file to your needs.
Creating and registering of the functionality that is needed.


### Steps
You can start the database with only a CSV file.
Go through the steps below and see the result in your browser.

1. place csv file, in dir extras.
2. `python3 create_model.py > ../model.go`
3. cd ../
4. go fmt
2. `python3 create_model_.py` answer the questions.
3. go fmt model.go
4. mv model.go ../
5. go build
6. ./lambda --help
7. ./lambda --csv assets/items.csv or `python3 ingestion.py -b 1000`
9. curl 127.0.0.1:8128/help/
10. browser 127.0.0.1:8128/


11. instructions curl 127.0.0.1:8128/help/ | python -m json.tool



### Running

sudo docker-compose up --no-deps --build

promql {instance="lambdadb:8000"}

python3 extras/ingestion.py -f movies_subset.tsv -format tsv -dbhost 127.0.0.1:8000
=======

1. instructions curl 127.0.0.1:8000/help/ | python -m json.tool

### Questions



### TODO

- load data directly from a database (periodic)
- document the `create_model.py` questions
- use a remote source for CSV
- use some compression faster to load than gzip
- generate swagger API
- Add more tests
24 changes: 12 additions & 12 deletions csv.go
@@ -1,18 +1,17 @@
package main

import (
"compress/gzip"
"encoding/json"
"errors"
"fmt"
csv "github.com/JensRantil/go-csv"
"github.com/cheggaaa/pb"
"github.com/klauspost/pgzip"
"io"
"log"
"os"
"strings"
"unicode/utf8"

csv "github.com/JensRantil/go-csv"
"github.com/cheggaaa/pb"
)

func containsDelimiter(col string) bool {
@@ -59,11 +58,11 @@ func copyCSVRows(itemChan ItemsChannel, reader *csv.Reader, ignoreErrors bool,
success := 0
failed := 0

items := Items{}
items := ItemsIn{}

for {
item := Item{}
columns := item.Columns()
itemIn := ItemIn{}
columns := itemIn.Columns()
cols := make([]interface{}, len(columns))
record, err := reader.Read()

@@ -98,7 +97,7 @@ func copyCSVRows(itemChan ItemsChannel, reader *csv.Reader, ignoreErrors bool,
// marshal it to bytes
b, _ := json.Marshal(itemMap)
// fill the new Item instance with values
if err := json.Unmarshal([]byte(b), &item); err != nil {
if err := json.Unmarshal([]byte(b), &itemIn); err != nil {
line := strings.Join(record, delimiter)
failed++

@@ -113,14 +112,15 @@ func copyCSVRows(itemChan ItemsChannel, reader *csv.Reader, ignoreErrors bool,

if len(items) > 100000 {
itemChan <- items
items = Items{}
items = ItemsIn{}
}
items = append(items, &item)
items = append(items, &itemIn)
success++
}

// add leftover items
itemChan <- items
items = nil

return nil, success, failed
}
@@ -142,7 +142,7 @@ func importCSV(filename string, itemChan ItemsChannel,
defer file.Close()

bar = NewProgressBar(file)
fz, err := gzip.NewReader(io.TeeReader(file, bar))
fz, err := pgzip.NewReader(io.TeeReader(file, bar))

if err != nil {
return err
@@ -179,7 +179,7 @@ func importCSV(filename string, itemChan ItemsChannel,
return fmt.Errorf("line %d: %s", lineNumber, err)
}

fmt.Printf("%d rows imported", success)
fmt.Printf("%d rows imported\n", success)

if ignoreErrors && failed > 0 {
fmt.Printf("%d rows could not be imported and have been written to stderr.", failed)
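
The `copyCSVRows` hunks above collect parsed rows into a slice, send the slice on `itemChan` once it grows past 100000 entries, and send the leftovers after the loop. A minimal sketch of that batching pattern follows. It is not the PR's code: `sendBatches`, `collectBatchSizes`, the string rows, and the small batch size are hypothetical stand-ins, and unlike the diff it skips the final send when no rows remain.

```go
package main

import "fmt"

// sendBatches is a hypothetical helper: it groups the values of in into
// slices of at most batchSize (assumed > 0) and sends each slice on out.
func sendBatches(in []string, batchSize int, out chan<- []string) {
	batch := make([]string, 0, batchSize)
	for _, v := range in {
		batch = append(batch, v)
		if len(batch) == batchSize {
			out <- batch
			// Allocate a fresh slice so the receiver owns the one just sent.
			batch = make([]string, 0, batchSize)
		}
	}
	// Flush leftovers, skipping the send when nothing remains.
	if len(batch) > 0 {
		out <- batch
	}
	close(out)
}

// collectBatchSizes runs sendBatches in a goroutine and records the size
// of every batch that arrives on the channel.
func collectBatchSizes(in []string, batchSize int) []int {
	out := make(chan []string)
	go sendBatches(in, batchSize, out)
	var sizes []int
	for b := range out {
		sizes = append(sizes, len(b))
	}
	return sizes
}

func main() {
	rows := []string{"a", "b", "c", "d", "e"}
	fmt.Println(collectBatchSizes(rows, 2)) // prints [2 2 1]
}
```

Replacing the slice after each send, as the diff does with `items = ItemsIn{}`, keeps the sender from appending into memory the receiver is still reading.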
22 changes: 22 additions & 0 deletions curlgeotest.sh
@@ -0,0 +1,22 @@
#!/usr/bin/env bash

set -x
set -e
set -u


curl -vvv \
--data-urlencode 'geojson={
"type": "Polygon",
"coordinates": [
[
[4.902321, 52.428306],
[4.90127, 52.427024],
[4.905281, 52.426069],
[4.906782, 52.426226],
[4.906418, 52.427469],
[4.902321, 52.428306]
]
]
}' \
'http://127.0.0.1:8000/list/?groupby=postcode&reduce=count'
Owner @Attumm, Apr 13, 2021:

Since this file is currently relevant to only one project and not to others, it can't be added.

Contributor Author:

Well, it's a first basic test that should be converted to a real test using real data. It took a long time to get it into its current form, and it also serves as an example. So mark it as NEEDS WORK or something.
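
One possible starting point for turning the curl script into a Go test is sketched below. It sends the same polygon as a form field named `geojson`, with `groupby` and `reduce` in the query string, which is what `curl --data-urlencode` does. The real server is replaced by `echoHandler`, a hypothetical stand-in that only reports what it received, so the sketch says nothing about LambdaDB's actual response format. `postGeoQuery` and `runGeoExample` are illustrative names as well.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// polygon is the GeoJSON from curlgeotest.sh, written on one line.
const polygon = `{"type":"Polygon","coordinates":[[[4.902321,52.428306],[4.90127,52.427024],[4.905281,52.426069],[4.906782,52.426226],[4.906418,52.427469],[4.902321,52.428306]]]}`

// postGeoQuery POSTs the polygon as the form field "geojson" and puts
// groupby and reduce in the query string, mirroring the curl call.
func postGeoQuery(baseURL, geojson string) (string, error) {
	query := url.Values{"groupby": {"postcode"}, "reduce": {"count"}}
	form := url.Values{"geojson": {geojson}}
	resp, err := http.PostForm(baseURL+"/list/?"+query.Encode(), form)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// echoHandler is a stand-in for the real server (an assumption): it
// reports the method, the query parameters and the size of the form field.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	if err := r.ParseForm(); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "%s groupby=%s reduce=%s geojson_bytes=%d",
		r.Method,
		r.URL.Query().Get("groupby"),
		r.URL.Query().Get("reduce"),
		len(r.PostForm.Get("geojson")))
}

// runGeoExample starts a throwaway local server and sends one request.
func runGeoExample() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(echoHandler))
	defer srv.Close()
	return postGeoQuery(srv.URL, polygon)
}

func main() {
	out, err := runGeoExample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In a real test the stand-in handler would be swapped for the project's own handler loaded with a small fixture dataset, and the assertion would compare the grouped counts.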

11 changes: 11 additions & 0 deletions curltest.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

set -x
set -e
set -u

# should be cached.
curl -vv 'http://127.0.0.1:8000/list/?groupby=woning_type&reduce=count'

# should not be cached.(using bitmaps)
curl -vv 'http://127.0.0.1:8000/list/?match-wijkcode=WK036394&groupby=woning_type&reduce=count'
Owner @Attumm, Apr 13, 2021:

Since this file is currently relevant to only one project and not to others, it can't be added.

Contributor Author:

We could add a test dataset and convert these into proper tests.
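
As a small first step toward such tests, the two request URLs from `curltest.sh` can be built programmatically rather than hard-coded. The sketch below assumes nothing about the server. `listURL` is a hypothetical helper, and the host and parameter names are copied from the script.

```go
package main

import (
	"fmt"
	"net/url"
)

// listURL builds a /list/ request URL from a host and query parameters.
// url.Values.Encode sorts by key, so the output is deterministic.
func listURL(host string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	u := url.URL{Scheme: "http", Host: host, Path: "/list/", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Plain groupby/reduce query (the script expects this one to be cached).
	fmt.Println(listURL("127.0.0.1:8000", map[string]string{
		"groupby": "woning_type",
		"reduce":  "count",
	}))
	// Same query narrowed with a match filter (expected not to be cached).
	fmt.Println(listURL("127.0.0.1:8000", map[string]string{
		"match-wijkcode": "WK036394",
		"groupby":        "woning_type",
		"reduce":         "count",
	}))
}
```

A table of such parameter maps paired with expected results would then drive a table-driven test once a fixture dataset exists.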

30 changes: 30 additions & 0 deletions custom.go
@@ -0,0 +1,30 @@
package main

import (
"strconv"
)

type registerCustomGroupByFunc map[string]func(*Item, ItemsGroupedBy)

var RegisterGroupByCustom registerCustomGroupByFunc

func init() {

RegisterGroupByCustom = make(registerCustomGroupByFunc)
RegisterGroupByCustom["gebruiksdoelen-mixed"] = GroupByGettersGebruiksdoelen

}

func reduceWEQ(items Items) map[string]string {
result := make(map[string]string)
weq := 0
for i := range items {
_weq, err := strconv.ParseInt(items[i].Woningequivalent, 10, 64)
if err != nil {
panic(err)
}
weq += int(_weq)
}
result["woningenquivalent"] = strconv.Itoa(weq)
return result
}
Owner @Attumm, Apr 13, 2021:

Since this file is currently relevant to only one project and not to others, it can't be added.
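
Separately from the question of where this file belongs, `reduceWEQ` panics when a `Woningequivalent` value does not parse, which would take down the whole process on a single bad row. A hedged alternative is sketched below: the reducer returns an error instead. The `item` type, the `sumWEQ` name, and the result key spelling are assumptions for illustration, and the real `Item` type and reducer signature are generated elsewhere in the PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// item is a hypothetical stand-in for the generated Item type; only the
// field used by the reducer is modelled here.
type item struct {
	Woningequivalent string
}

// sumWEQ adds up the Woningequivalent values and reports the first value
// that fails to parse as an error instead of panicking.
func sumWEQ(items []item) (map[string]string, error) {
	var total int64
	for i, it := range items {
		weq, err := strconv.ParseInt(it.Woningequivalent, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("item %d: invalid woningequivalent %q: %w",
				i, it.Woningequivalent, err)
		}
		total += weq
	}
	// The key name is an assumption; the diff uses its own spelling.
	return map[string]string{"woningequivalent": strconv.FormatInt(total, 10)}, nil
}

func main() {
	res, err := sumWEQ([]item{{"2"}, {"3"}})
	fmt.Println(res, err) // prints map[woningequivalent:5] <nil>

	_, err = sumWEQ([]item{{"not-a-number"}})
	fmt.Println(err != nil) // prints true
}
```

Whether an error return fits depends on the reducer signature the rest of the code expects. Skipping and logging bad rows would be another option.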
