Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java Loggers #939

Merged
merged 80 commits into from
Oct 31, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
80 commits
Select commit Hold shift + click to select a range
e2353fe
First almost direct port of resolver, dataschema and ColumnSchema
Aug 23, 2022
37afe48
Add a factory for the StandardMetrics
Aug 26, 2022
afa2b76
Add data typing to match types to fractional, integer, and string
Aug 26, 2022
d5fab50
ColumnSchema and DatasetSchema with tests
Aug 26, 2022
0d71bdf
Moves file
Aug 26, 2022
65340f6
Code and testing for StandardMetric (note this is a factory were we w…
Aug 26, 2022
1eea1e7
Reolver, StandardResolver, and tests
Aug 26, 2022
0164e93
Java Linter
Aug 26, 2022
4e9cf91
Adding in privacy and an lombrok constructor instead
Aug 30, 2022
f7dc0a9
Changes based off review
Aug 30, 2022
96f0232
Refactor to not use the StandardMetric.java partial factory. Allow re…
Sep 1, 2022
1bb305e
Updates data schema to show dirty if there is a change to the columns…
Sep 1, 2022
a6820bb
Merge branch 'dev/melly/java/resolver' into dev/melly/java/profile-an…
Sep 1, 2022
b1cce16
Adding a getter and the Singl
Sep 6, 2022
1e94182
Most of the columnProfile is here. There is some needed changes to me…
Sep 6, 2022
7a0221c
WIP with the protobuf
Sep 8, 2022
fba212c
Removes protobuf as I don't think I need it in java
Sep 8, 2022
5f9e810
Finished first take of the dataset and column profile/view. Still err…
Sep 8, 2022
8d34448
Refactor Metric to be able to merge on main. It's not my favorite des…
Sep 8, 2022
6cbf871
Remove all protobuf and write (it's on the protobuf branch now). Upda…
Sep 12, 2022
7e61b1e
Tests for simple utiltiies
Sep 12, 2022
aea9c31
tests for ColumnProfileView
Sep 12, 2022
9e41255
tests for DatasetProfileView
Sep 12, 2022
3059ae5
place holder for TestColumnProfile
Sep 13, 2022
c6e04e3
Tests for ColumnProfile
Sep 14, 2022
3144530
skeleton
Sep 22, 2022
169e904
Holding place - figuring out why the view isn't visible
Sep 22, 2022
a44b1d5
Merge branch 'mainline' of github.com:whylabs/whylogs into dev/melly/…
Sep 22, 2022
68930a0
Merge branch 'dev/melly/java/profile-and-view' into dev/melly/java/lo…
Sep 22, 2022
4267f7c
removes unused import and does spotless Java
Sep 22, 2022
85c3575
fixes a test
Sep 22, 2022
808c5a3
Unused import
Sep 22, 2022
b2d5716
Implements resultsSet and does spotless
Sep 22, 2022
5989892
Testing result set and cleaning up
Sep 23, 2022
fbdc45f
Removes un needed method
Sep 23, 2022
9c725bf
rename variables timestampe to timestamp
Sep 23, 2022
d823006
refactoring name from cachedSize to cacheSize
Sep 23, 2022
e7daf77
Add copy to max components
Sep 23, 2022
acf7a78
Add no metric config zero for IntegralMetric
Sep 23, 2022
d7ee647
change init to do a shallow copy
Sep 23, 2022
3a02a33
Update tests
Sep 23, 2022
c0d2dc7
Moves all views returns to be unmodifiable
Sep 23, 2022
62f62b2
fix init bug in ColumnProfile and java spotless
Sep 23, 2022
41db21c
Moving row from Map<String, T> to Map<String, Object> this was due to…
Sep 23, 2022
4b48821
linter
Sep 23, 2022
0cc8949
Change to a shallow copy of metrics
Sep 26, 2022
e08bba1
Cleaning up metric merge
Sep 26, 2022
2b4ca16
Change from Date to Instance
Sep 26, 2022
2df442b
Adds null count to OperationResult, adds testing. Discovered bug (tod…
Sep 26, 2022
4422097
Fixes issue with metric tracking, a couple sp fixes
Oct 7, 2022
4d42803
Linter
Oct 7, 2022
0e9ede1
Adds autoclosable on columnprofile and linter
Oct 7, 2022
0ea4029
Fixes tests and spotlessapply
Oct 7, 2022
66c35bd
Changing metric to change the Metric to be the start of the CRTP inst…
Oct 7, 2022
71e7512
Merge branch 'dev/melly/java/profile-and-view' into dev/melly/java/re…
Oct 11, 2022
4c4ff50
Finishes merge, linted, and finished the tests
Oct 11, 2022
163383e
Merge branch 'mainline' into dev/melly/java/resultset
Oct 13, 2022
24aff91
Getting skeletons for Writers and writable in. Will need to think thr…
Oct 17, 2022
86e816e
Adding skeleton writers to logger
Oct 17, 2022
7d0423f
Logs a hashmap. Python has a generic object, we need to implement thi…
Oct 17, 2022
d5c2d40
Adds TransientLogger
Oct 17, 2022
d3e48e4
TimedRolledLogger started and Scheduler worked on. Still needs the ti…
Oct 18, 2022
e0cd6bd
iteration of scheduler - adds in scheduledExecurtor
Oct 19, 2022
10bb8b8
basis for the TimedRollingLogger, need to think though the flush and …
Oct 19, 2022
766dfe2
Merge branch 'mainline' into dev/melly/java/loggers
Oct 26, 2022
ec27301
Finishes TransientLogger after testing and debugging. TimedRollingLog…
Oct 27, 2022
f36a1d3
Renames folder to match google style guide and linter
Oct 27, 2022
18e4365
Merge branch 'mainline' into dev/melly/java/loggers
Oct 27, 2022
ad59752
Updates resultset
Oct 27, 2022
3f413cd
fixes package error on result set
Oct 27, 2022
865fa41
removing duplicate foler
Oct 27, 2022
e2057e2
Accidentally overroad ResultSet changes. Fixes that.
Oct 27, 2022
6271308
linter
Oct 27, 2022
83cc077
linter
Oct 27, 2022
aa94e1b
Merge branch 'dev/melly/java/loggers' of github.com:whylabs/whylogs i…
Oct 28, 2022
ac00bb0
Error and comments adds
Oct 28, 2022
2370f39
linter
Oct 28, 2022
15c56e0
linter
Oct 28, 2022
cf51f6a
Merge branch 'dev/melly/java/loggers' of github.com:whylabs/whylogs i…
Oct 28, 2022
9d0316e
linter
Oct 28, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
84 changes: 83 additions & 1 deletion java/core/src/main/java/com/whylogs/api/logger/Logger.java
Original file line number Diff line number Diff line change
@@ -1,3 +1,85 @@
package com.whylogs.api.logger;

public class Logger {}
import com.whylogs.api.logger.resultsets.ProfileResultSet;
import com.whylogs.api.logger.resultsets.ResultSet;
import com.whylogs.api.writer.Writer;
import com.whylogs.api.writer.WritersRegistry;
import com.whylogs.core.DatasetProfile;
import com.whylogs.core.schemas.DatasetSchema;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.ToString;

@NoArgsConstructor
@Getter
@EqualsAndHashCode
@ToString
public abstract class Logger implements AutoCloseable {
private boolean isClosed = false;
private DatasetSchema schema = new DatasetSchema();
private ArrayList<Writer> writers = new ArrayList<>();

public Logger(DatasetSchema schema) {
this.schema = schema;
}

public <T extends Writer> void checkWriter(T Writer) {
// Checks if a writer is configured correctly for this class
// Question: why is this empty but not an abstract?
}

public void appendWriter(String name) {
if (name == null || name.isEmpty()) {
throw new IllegalArgumentException("Writer name cannot be empty");
}

Writer writer = WritersRegistry.get(name);
if (writer == null) {
throw new IllegalArgumentException("Writer " + name + " is not registered");
}

appendWriter(writer);
}

public void appendWriter(Writer writer) {
if (writer == null) {
throw new IllegalArgumentException("Writer cannot be null");
}

checkWriter(writer);
writers.add(writer);
}

protected abstract ArrayList<DatasetProfile> getMatchingProfiles(Object data);

protected abstract <O> ArrayList<DatasetProfile> getMatchingProfiles(Map<String, O> data);

@Override
public void close() {
isClosed = true;
}

public ResultSet log(HashMap<String, Object> data) {
// What type of data is the object? Right now we don't process that in track.
if (isClosed) {
throw new IllegalStateException("Logger is closed");
} else if (data == null) {
throw new IllegalArgumentException("Data cannot be null");
}

// TODO: implement segment processing here

ArrayList<DatasetProfile> profiles = getMatchingProfiles(data);
for (DatasetProfile profile : profiles) {
profile.track(data);
}

// Question: Why does this only return the first profile? IS this
// getting ready for multiple profiles later on?
return new ProfileResultSet(profiles.get(0));
}
}
Original file line number Diff line number Diff line change
@@ -1,3 +1,32 @@
package com.whylogs.api.logger;

public class TransientLogger {}
import com.whylogs.core.DatasetProfile;
import com.whylogs.core.schemas.DatasetSchema;
import java.util.ArrayList;
import java.util.Map;
import lombok.*;

@NoArgsConstructor
@Getter
@EqualsAndHashCode(callSuper = false)
@ToString
public class TransientLogger extends Logger {
public TransientLogger(DatasetSchema schema) {
super(schema);
}

@Override
protected ArrayList<DatasetProfile> getMatchingProfiles(Object data) {
// In this case, we don't have any profiles to match against
ArrayList<DatasetProfile> profiles = new ArrayList<>();
DatasetProfile profile = new DatasetProfile(getSchema());
profiles.add(profile);
return profiles;
}

@Override
protected <O> ArrayList<DatasetProfile> getMatchingProfiles(Map<String, O> data) {
// In this case, we don't have any profiles to match against
return getMatchingProfiles((Object) data);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
public class ProfileResultSet extends ResultSet {
@NonNull private final DatasetProfile profile;

public ProfileResultSet(DatasetProfile profile) {
public ProfileResultSet(@NonNull DatasetProfile profile) {
super();
this.profile = profile;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,6 @@ public abstract class ResultSet {
public abstract Optional<DatasetProfile> profile();

// TODO: Come back for ModelPerformanceMetrics

public void addMetric(String name, Metric<?> metric) throws Error {
DatasetProfile profile =
this.profile()
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
package com.whylogs.api.logger.rollingLogger;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.ToString;

@Getter
@EqualsAndHashCode
@ToString
public class Scheduler {
// Multithreading schedule.
// Schedule a function to be called repeatedly based on a schedule
private ScheduledExecutorService scheduledService;
private float initial;
private boolean ranInitial = false;
private float interval;
private Runnable func;
private boolean isRunning = false;
private String[] args;
// TODO: figure out args an dkwards

public Scheduler(float initial, float interval, Runnable func, String[] args) {
this.initial = initial;
this.interval = interval;
this.func = func;
this.args = args;
this.start();
}

private void run() {
// TODO: Looking at this I think this is wrong to have lines 35 & 36
this.isRunning = false;
this.start(); // Question: why do we need to start again?
this.func.run(); // TODO: figure out args and kwargs
}

public void start() {
if (this.isRunning) {
return;
}

float initial = 0;
if (!this.ranInitial) {
initial = this.getInitial();
this.ranInitial = true;
}

this.scheduledService = Executors.newSingleThreadScheduledExecutor();
this.scheduledService.scheduleAtFixedRate(
this::run, (long) initial, (long) this.interval, TimeUnit.SECONDS);
this.isRunning = true;
}

public void stop() {
this.scheduledService.shutdown();
this.isRunning = false;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,173 @@
package com.whylogs.api.logger.rollingLogger;

import com.whylogs.api.logger.Logger;
import com.whylogs.api.writer.Writer;
import com.whylogs.core.DatasetProfile;
import com.whylogs.core.schemas.DatasetSchema;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.Callable;

public class TimedRollingLogger extends Logger implements AutoCloseable {
// A rolling logger that continuously rotates files based on time
private DatasetSchema schema;
private String baseName;
private String fileExtension;
private int interval;
private Character when = 'H'; // TODO: Make the Literals of S M H D
private boolean utc = false;
private boolean align = true;
private boolean skipEmpty = false;
private String suffix;

private DatasetProfile currentProfile;
private Callable<Writer> callback; // TODO: this isn't the write signature
private Scheduler scheduler;
private int currentBatchTimestamp;

// TODO: callback: Optional[Callable[[Writer, DatasetProfileView, str], None]]
public TimedRollingLogger(
DatasetSchema schema, String baseName, String fileExtension, int interval) {
this(schema, baseName, fileExtension, interval, 'H', false, true, false);
}

public TimedRollingLogger(
DatasetSchema schema, String baseName, String fileExtension, int interval, Character when) {
this(schema, baseName, fileExtension, interval, when, false, true, false);
}

public TimedRollingLogger(
DatasetSchema schema,
String baseName,
String fileExtension,
int interval,
Character when,
boolean utc,
boolean align,
boolean skipEmpty) {
super(schema);

this.schema = schema;
this.baseName = baseName;
this.fileExtension = fileExtension;
this.interval = interval;
this.when = Character.toUpperCase(when);
this.utc = utc;
this.align = align;
this.skipEmpty = skipEmpty;

if (this.baseName == null || this.baseName.isEmpty()) {
this.baseName = "profile";
}
if (this.fileExtension == null || this.fileExtension.isEmpty()) {
this.fileExtension = ".bin"; // TODO: should we make this .whylogs?
}

switch (this.when) {
case 'S':
this.interval = 1; // one second
this.suffix = "%Y-%m-%d_%H-%M-%S";
break;
case 'M':
this.interval = 60; // one minute
this.suffix = "%Y-%m-%d_%H-%M";
break;
case 'H':
this.interval = 60 * 60; // one hour
this.suffix = "%Y-%m-%d_%H";
break;
case 'D':
this.interval = 60 * 60 * 24; // one day
this.suffix = "%Y-%m-%d";
break;
default:
throw new IllegalArgumentException(
"Invalid value for when: " + this.when + ". Must be S, M, H, or D");
}

this.interval = this.interval * interval; // / multiply by units requested
this.utc = utc;

Instant currentTime = Instant.now();
this.currentBatchTimestamp = this.computeCurrentBatchTimestamp(currentTime.getEpochSecond());
this.currentProfile = new DatasetProfile(schema, currentTime, currentTime);
int initialRunAfter =
(this.currentBatchTimestamp + this.interval) - (int) currentTime.getEpochSecond();
if (initialRunAfter < 0) {
// TODO: Add logging error as this shouldn't happen
initialRunAfter = this.interval;
}

this.scheduler = new Scheduler(initialRunAfter, this.interval, this::doRollover, null);
this.scheduler.start();

// autocloseable closes at end
}

private int computeCurrentBatchTimestamp(long nowEpoch) {
int roundedNow = (int) nowEpoch; // rounds by going from an long to a int (truncates)
if (this.align) {
return (Math.floorDiv((roundedNow - 1), this.interval)) * this.interval + this.interval;
}
return roundedNow;
}

public void checkWriter(Writer writer) {
writer.check_interval(this.interval);
}

private ArrayList<DatasetProfile> getMatchingProfiles() {
ArrayList<DatasetProfile> matchingProfiles = new ArrayList<>();
matchingProfiles.add(this.currentProfile);
return matchingProfiles;
}

@Override
protected ArrayList<DatasetProfile> getMatchingProfiles(Object data) {
return this.getMatchingProfiles();
}

@Override
protected <O> ArrayList<DatasetProfile> getMatchingProfiles(Map<String, O> data) {
return this.getMatchingProfiles();
}

private void doRollover() {
if (this.isClosed()) {
return;
}

DatasetProfile oldProfile = this.currentProfile;
Instant currentTime = Instant.now();
this.currentBatchTimestamp = this.computeCurrentBatchTimestamp(currentTime.getEpochSecond());
this.currentProfile = new DatasetProfile(schema, currentTime, currentTime);

this.flush(oldProfile);
}

private void flush(DatasetProfile profile) {
if (profile == null) {
return;
} else if (this.skipEmpty && profile.isEmpty()) {
// set logger logger.debug("skip_empty is set. Skipping empty profiles")
return;
}

// get time to get name
String timedFileName = this.baseName + "_" + this.currentBatchTimestamp + this.fileExtension;

// Sleep while the profile is active?
// TODO: this is where we call the store list.write
// TODO: go through through the writers
}

public void close() {
// TODO log that we are closing the writer
if (!this.isClosed()) {
// Autoclose handles the isCLosed()
this.scheduler.stop();
this.flush(this.currentProfile);
}
}
}
22 changes: 22 additions & 0 deletions java/core/src/main/java/com/whylogs/api/writer/Writable.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
package com.whylogs.api.writer;

import java.io.File;
import java.io.FileWriter;

public interface Writable {

static FileWriter safeOpenWrite(String path) {
// Open 'path' for writing, creating any parent directories as needed
File file = new File(path);
FileWriter writer = null;
try {
writer = new FileWriter(file, true);
} catch (Exception e) {
System.out.println("Error: " + e);
e.printStackTrace();
}

// this close happens latter on
return writer;
}
}
Loading