
Migrate AutoAugment and RandAugment to TensorFlow Addons. #1226

Closed
dynamicwebpaige opened this issue Mar 5, 2020 · 63 comments
Labels
Feature Request, help wanted (Needs help as a contribution), image

Comments

@dynamicwebpaige
Contributor

Describe the feature and the current behavior/state.
RandAugment and AutoAugment are both enhanced image preprocessing policies that are included in EfficientNet, but they still rely on tf.contrib.

https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/autoaugment.py

The only tf.contrib image operations that they use, however, are rotate, translate and transform - all of which have been included in TensorFlow Addons.

Relevant information

  • Are you willing to contribute it (yes/no):
No, but I am hoping that someone from the community will pick it up (potentially a Google Summer of Code student).

  • Are you willing to maintain it going forward? (yes/no):
    Yes

  • Is there a relevant academic paper? (if so, where):
    AutoAugment Reference: https://arxiv.org/abs/1805.09501
    RandAugment Reference: https://arxiv.org/abs/1909.13719

  • Is there already an implementation in another framework? (if so, where):
    See link above; this would be a standard migration from tf.contrib.

  • Was it part of tf.contrib? (if so, where):
    Yes

Which API type would this fall under (layer, metric, optimizer, etc.)
Image

Who will benefit with this feature?
Anyone doing image preprocessing, especially for EfficientNet.

@seanpmorgan added the Feature Request, help wanted, and image labels on Mar 5, 2020
@bhack
Contributor

bhack commented Mar 5, 2020

For GSoC we probably need to think more generally about preprocessing.
We still have NumPy-based operations in https://keras.io/preprocessing/image/.
It also needs to be investigated whether there will be any integration between AutoAugment and AutoKeras.

@stmugisha

For GSoC we probably need to think more generally about preprocessing.
We still have NumPy-based operations in https://keras.io/preprocessing/image/.

@bhack can you please elaborate on what you mean by this? Thanks.

@bhack
Contributor

bhack commented Mar 5, 2020

Many image operations in Keras are still implemented in NumPy rather than in Addons or tf.image:
https://github.com/keras-team/keras-preprocessing/tree/master/keras_preprocessing/image
Are other image operations going into TF.IO or TF.graphics?
I think we need to unify image processing so that the ops can eventually benefit from the compiler stack (MLIR & friends).

@bhack
Contributor

bhack commented Mar 5, 2020

Also, regarding AutoKeras: could the policy be handled by AutoKeras, so that AutoAugment could be used more generally in other projects/experiments with AutoKeras + keras_preprocessing, instead of being embedded in EfficientNet?

@abhichou4
Contributor

abhichou4 commented Mar 5, 2020

Also, regarding AutoKeras: could the policy be handled by AutoKeras, so that AutoAugment could be used more generally in other projects/experiments with AutoKeras + keras_preprocessing, instead of being embedded in EfficientNet?

I don't think AutoKeras handles policies as such. And if policies are just a series of operations, we could perhaps store them as named tuples. A parser (for lack of a better word) could then return an ImageAugmentation object from a policy. The methods of this class could use Keras preprocessing, tf.image, etc. Seems messy. Is this better than porting already-implemented image ops into tfa.image?

We could train models with various policies using this as a proof of concept.
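A minimal sketch of the named-tuple idea above (all names here, `OpSpec`, `ImageAugmentation`, and the op table, are hypothetical illustrations, not an existing API):

```python
import collections
import random

# Hypothetical policy representation: a sub-policy is a sequence of
# (name, probability, magnitude) triples, a policy is a list of sub-policies.
OpSpec = collections.namedtuple("OpSpec", ["name", "probability", "magnitude"])

policy = [
    [OpSpec("rotate", 0.8, 30), OpSpec("translate_x", 0.4, 9)],
    [OpSpec("color", 0.6, 5), OpSpec("shear_y", 0.2, 7)],
]

class ImageAugmentation:
    """Built by the hypothetical "parser": maps op names to callables,
    which could be backed by tf.image, Keras preprocessing, etc."""

    def __init__(self, policy, op_table):
        self.policy = policy
        self.op_table = op_table

    def __call__(self, image, rng=None):
        rng = rng or random.Random()
        # Pick one sub-policy at random, then apply each of its ops
        # with the op's own probability.
        sub_policy = rng.choice(self.policy)
        for op in sub_policy:
            if rng.random() < op.probability:
                image = self.op_table[op.name](image, op.magnitude)
        return image
```

The op table keeps the policy data decoupled from the backend, which is what would let the same policy run against tfa.image or keras_preprocessing.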

@bhack
Contributor

bhack commented Mar 5, 2020

AutoKeras was "recently" integrated with Keras preprocessing: https://github.com/keras-team/autokeras/pull/922/files.
I agree that we can probably investigate what is needed to expand on AutoAugment (or other kinds of) policies.
But I think that for auto-augmentation it is also important to have ops that can deliver maximum performance (including in scheduling), because you will not run these operations offline.

@fsx950223
Member

I'm interested in it and I have benefited a lot from AutoAugment.

@seanpmorgan
Member

seanpmorgan commented Mar 10, 2020

While looking at how AutoKeras is handling hyperparameter containers, at first glance it seems like a suitable replacement for HParams:
https://github.com/keras-team/keras-tuner/blob/master/kerastuner/engine/hyperparameters.py#L464

We initially decided not to move HParams when we left tf.contrib (even though it's convenient and works very well), since we didn't want to diverge from officially supported APIs.

Adding KerasTuner as a dependency has its own challenges, but I'm wondering if there is a better way to align with the ecosystem by using it.

@omalleyt12 @gabrieldemarmiesse do you have any thoughts about re-using KerasTuner's HyperParameter object (likely overkill for what's needed in AutoAugment)? Or just general thoughts on how AutoAugment in Addons fits with the KerasTuner and Keras Preprocessing advances?

@bhack
Contributor

bhack commented Mar 11, 2020

There is also an AutoAugment policy variant in https://github.com/google-research/remixmatch

@abhichou4
Contributor

Should these augmentations be added as part of a submodule in addons?

@bhack
Contributor

bhack commented Mar 11, 2020

I don't know, probably. It has its own augmentation sub-"library" 😄 https://github.com/google-research/remixmatch/tree/master/libml
It is quite common to find this fragmentation across google-research repos.

@bhack
Contributor

bhack commented Mar 11, 2020

/cc @carlini

@carlini

carlini commented Mar 11, 2020

The remixmatch repository is intended to faithfully reproduce the experiments of the corresponding ICLR'20 paper. We do not intend for this repository to be the source of truth for any particular implementation.

@gabrieldemarmiesse
Member

@seanpmorgan I don't think that we should depend on Keras-Tuner, except when implementing papers where the end result is a hyperparameter search algorithm.

For papers that use a search algorithm to produce a result, like AutoAugment and RandAugment, we should hardcode the final numbers here and here and make sure our API/architecture is modular enough for other people to plug it into a hyperparameter search algorithm.

In short, let's make it easy for users to change those values; it's up to them to plug our API into keras-tuner if they want.
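A sketch of that "hardcode but keep overridable" approach for RandAugment (the function and op-table names are hypothetical, and the defaults shown are illustrative: the paper's best N/M values vary per model):

```python
import random

# Illustrative module-level defaults; users (or a tuner) can override them
# per call without touching this module.
RAND_AUGMENT_NUM_LAYERS = 2   # "N" in the RandAugment paper
RAND_AUGMENT_MAGNITUDE = 10   # "M" in the RandAugment paper

def rand_augment(image, op_table,
                 num_layers=RAND_AUGMENT_NUM_LAYERS,
                 magnitude=RAND_AUGMENT_MAGNITUDE,
                 rng=None):
    """Apply `num_layers` ops chosen uniformly at random, all at the
    same fixed `magnitude` (the core RandAugment idea)."""
    rng = rng or random.Random()
    names = sorted(op_table)
    for _ in range(num_layers):
        image = op_table[rng.choice(names)](image, magnitude)
    return image
```

Because N and M are plain keyword arguments, plugging this into keras-tuner is just a matter of sampling them outside the function, with no dependency from our side.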

@bhack
Contributor

bhack commented Mar 16, 2020

@gabrieldemarmiesse What do you think about the ReMixMatch AutoAugment variant that learns the policy during training?
https://github.com/google-research/remixmatch/blob/master/libml/augment.py

@gabrieldemarmiesse
Member

@bhack could you expand? I'm not sure I understand your question.

@bhack
Contributor

bhack commented Mar 16, 2020

@gabrieldemarmiesse I meant: how could we organize policies more generally? Even with just a first AutoAugment variant, I think we would need to organize policies a little.

@carlini

carlini commented Mar 16, 2020

ReMixMatch's augment policy (CTA) is slightly different from standard augmentation policies because it needs to be integrated with the training loop. At every minibatch step, the policy has to be "trained" with a second minibatch of examples so that it can determine the magnitude of perturbations that are allowed.

cc @david-berthelot
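The per-minibatch update described above could be sketched roughly like this (all names, the bin count, and the EMA details are illustrative, not ReMixMatch's actual code):

```python
import random

class CTAPolicy:
    """Hypothetical sketch of a CTAugment-style learned policy: per-op,
    per-magnitude-bin confidence weights updated every minibatch."""

    def __init__(self, op_names, num_bins=17, decay=0.99, rng=None):
        self.weights = {name: [1.0] * num_bins for name in op_names}
        self.decay = decay
        self.rng = rng or random.Random()

    def sample(self):
        # Pick an op uniformly, then a magnitude bin weighted by confidence.
        name = self.rng.choice(sorted(self.weights))
        bins = self.weights[name]
        bin_idx = self.rng.choices(range(len(bins)), weights=bins)[0]
        return name, bin_idx

    def update(self, name, bin_idx, match_rate):
        # Exponential moving average of how well the model's predictions on
        # the augmented second minibatch matched the labels; this is the
        # "training" step the comment above describes at every minibatch.
        w = self.weights[name]
        w[bin_idx] = self.decay * w[bin_idx] + (1 - self.decay) * match_rate
```

In a real loop, each training step would interleave a model update on one minibatch with `policy.update(...)` computed from a second, labeled minibatch.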

@bhack
Contributor

bhack commented Mar 16, 2020

Yes, we also need to consider https://github.com/google-research/fixmatch, but it seems to me it doesn't introduce new augmentation policies.

@bhack
Contributor

bhack commented Mar 17, 2020

AutoAugment also appears (is it just a copy?) in the just-released EfficientDet: https://github.com/google/automl/blob/master/efficientdet/aug/autoaugment.py.
We really need a place, like this one, to host an official, extensible AutoAugment API. 😄

@carlini

carlini commented Mar 18, 2020

Fixmatch uses the same augmentations as ReMixMatch.

@bhack
Contributor

bhack commented Mar 18, 2020

Thank you for the FixMatch confirmation. Meanwhile, I see that @mingxingtan added some extra auto-augmentations in EfficientDet.

@vinayvr11

vinayvr11 commented Mar 18, 2020

@dynamicwebpaige
Hello, I want to work on this; I am applying for this year's GSoC. Are all of the operations present in TensorFlow Addons, or do we need to create them from scratch?

@bhack
Contributor

bhack commented Mar 18, 2020

@vinayvr11 There is an initial PR at #1275.
It would be better if you start contributing some ops via PRs before being evaluated for GSoC.
Please open a ticket and mention @abhichou4 before you start, so you don't overlap with him.

@abhichou4
Contributor

I'll make an issue regarding this, listing all image ops to add. We can also discuss how they can be handled more generally in tfa.image.

@vinayvr11

Ok thank you @bhack.

@vinayvr11

Hello @bhack, could you please help me find some more issues in TensorFlow Addons or tf.image so that I can mention them in my GSoC proposal?
Thank you

@bhack
Contributor

bhack commented Mar 21, 2020

@vinayvr11 These are general hints for every student.
I suppose that all the image ops and bbox ops we have listed from the scattered repositories, plus the design of an augmentation policy API for porting the policies from the repositories we have mentioned, could be enough for a proposal.
There are also some colorspace conversions that I mentioned, if you want to expand the op coverage, and some operations in MediaPipe calculators that could be interesting to cover in Addons.
For all applicants, I suggest studying the policy papers we have mentioned and the related repositories a little, to make a credible estimate of the amount of work and figure out a timeline for the proposal.
Having a credible roadmap in the proposal is a positive evaluation point: it lets the mentor see that you have really understood the nature of the work that needs to be done during GSoC.

I also suggest you partially go ahead with PRs where you can, so that mentors have a valid sample of your coding. I.e., if possible, take an operator from the referenced repositories that is not already expressed in TensorFlow (i.e. NumPy/PIL), so that mentors can get feedback on your TensorFlow coding ability beyond porting.

@vinayvr11

Thank you very much for this, @bhack. Actually, I also found some loss functions and optimizers that are listed in tf.contrib but not in TensorFlow 2.x; can I also mention them in the proposal?

@vinayvr11

@bhack: could you please review my proposal? It would be a great help: https://docs.google.com/document/d/1mv32xoGI08JP1wcMiugTyVBzCee7dsyYK_5Uf6SjxEE/edit?usp=sharing

@bhack
Contributor

bhack commented Mar 21, 2020

@vinayvr11 See @dynamicwebpaige's best practices on how to collect feedback.

@bhack
Contributor

bhack commented Mar 22, 2020

@dynamicwebpaige I don't know how many slots we could have for similar tasks at GSoC, but another related "proxy task" could be image text augmentation, like CVPR 2020's https://github.com/Canjie-Luo/Text-Image-Augmentation/

@gabrieldemarmiesse
Member

@bhack @vinayvr11 please keep this thread focused on AutoAugment and RandAugment. Feel free to use direct messages or to open new issues if you think the topic has changed. Having an issue with 40+ messages makes the maintainers' lives quite hard.

@bhack
Contributor

bhack commented Mar 22, 2020

@gabrieldemarmiesse it would probably have been better to open a Gitter channel dedicated to GSoC, separate from the Addons Gitter, for these kinds of threads, so that issues aren't forced off-topic. Google doesn't officially support any realtime chat channel, and the Google Summer of Code TensorFlow page still points to https://github.com/tensorflow/community (you'll also find off-topic GSoC issues there).
That repo is mainly used for official RFCs, even though there is no real policy for issues in it.
Also, IMHO this issue is quite special, because it involves GSoC in its description and started with a very partial overview of a GSoC proposal related to image transformations and policies. As we have seen, just referencing some other repos quickly exposed the fragmentation of independent Google teams working on this topic.
I think here we are more interested in a general approach to transformations and policies, and IMHO having a complete overview, rather than just porting code, is a better target for a GSoC proposal.
So this moved toward a more general discussion.
For the operations, as you have seen, we already have independent tickets and PRs in Addons to track the work.

@bhack
Contributor

bhack commented Apr 3, 2020

How are we going to coordinate with the image processing that is landing in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/layers/preprocessing/image_preprocessing.py?

@bhack
Contributor

bhack commented Apr 3, 2020

E.g. @haifeng-jin is using those operations in AutoKeras.

@bhack
Contributor

bhack commented Apr 4, 2020

/cc @tanzhenyu, who, according to git blame, is working on the image_preprocessing ops in Keras.

@tanzhenyu
Contributor

The image preprocessing ops only contain basic operations like rotate and translate; they don't include all the 'advanced' ops like blend, bbox, or augment.
That said, it might be good to have a standalone repo like keras-image with useful layers/utils that rely on tfa.image?

@bhack
Contributor

bhack commented Apr 4, 2020

I think we need to de-duplicate code as much as possible, not just because it is a waste of time but also because it is confusing for newcomers (users and candidate contributors).
I.e., if we collect image ops here in TFA, I suppose the natural upstream is tf.image, as in our migration-to-core process documentation.
What about the policies mentioned in this thread? Are they in the Addons perimeter? In the standalone Keras perimeter? Or in keras-team/autokeras@49281da?

@tanzhenyu
Contributor

tanzhenyu commented Apr 4, 2020

I think we need to de-duplicate code as much as possible, not just because it is a waste of time but also because it is confusing for newcomers (users and candidate contributors).
I.e., if we collect image ops here in TFA, I suppose the natural upstream is tf.image, as in our migration-to-core process documentation.
What about the policies mentioned in this thread? Are they in the Addons perimeter? In the standalone Keras perimeter? Or in keras-team/autokeras@49281da?

Why upstream? It sounds like tfa.image is a complementary suite to tf.image?

@bhack
Contributor

bhack commented Apr 4, 2020

I meant that, since TF Addons is historically the fusion of the TensorFlow contrib community and the keras-contrib community, we have two official upstream promotion paths, as you can see in point 3.
So if we all agree to add new image operations to TFA, the natural promotion path is tf.image, not Keras.
Then there is the second point mentioned in this thread: what about an auto-augmentation policies API? Does it go in TF Addons, as a candidate for upstreaming into Keras? In AutoKeras? In experimental Keras?

@tanzhenyu
Contributor

I meant that, since TF Addons is historically the fusion of the TensorFlow contrib community and the keras-contrib community, we have two official upstream promotion paths, as you can see in point 3.
So if we all agree to add new image operations to TFA, the natural promotion path is tf.image, not Keras.
Then there is the second point mentioned in this thread: what about an auto-augmentation policies API? Does it go in TF Addons, as a candidate for upstreaming into Keras? In AutoKeras? In experimental Keras?

I agree with @gabrieldemarmiesse: tfa shouldn't rely on Keras-Tuner or include learnable policies; they can just be hard-coded. For the auto-augmentation policies API, it seems a keras-image repo (which would rely on both tfa.image and Keras-Tuner, with AutoKeras relying on it in turn) would be a better fit IMHO.

@bhack
Contributor

bhack commented Apr 4, 2020

I think we need to discuss this a little, because we need to clarify the difference between contributing to keras.experimental and to TF Addons (also to route GSoC students, contributors, etc.) and to avoid scattering implementations across multiple repos, since it is currently hard to keep a monthly overview across the TensorFlow/Keras ecosystems.

@tanzhenyu
Contributor

I think we need to discuss this a little, because we need to clarify the difference between contributing to keras.experimental and to TF Addons (also to route GSoC students, contributors, etc.) and to avoid scattering implementations across multiple repos, since it is currently hard to keep a monthly overview across the TensorFlow/Keras ecosystems.

Yep, we definitely need this to be modular and architectural. I will attend the next meeting.

@bhack
Contributor

bhack commented Apr 4, 2020

Yep, we definitely need this to be modular and architectural. I will attend the next meeting.

We need to involve somebody from the google-research repo to be sure that Google's internal policies will not prevent that team from contributing to and feeding a policy API under the Keras org. As you can see in this thread, many of the mentioned auto-augmentations and ops come from google-research repos.
See also the already mentioned tensorflow/community#223

@tanzhenyu
Contributor

Some follow-up on this: Francois and I discussed and decided that the new keras_cv will include autoaugment and randaugment.

@seanpmorgan
Member

seanpmorgan commented May 28, 2020

Closing, as this is now going to be implemented elsewhere in the ecosystem. Since we've already merged a few components, we can deprecate them as the replacements become available.

@Lewington-pitsos

It's been close to a year with no word from on high, and keras_cv still looks pretty sparse. Has there been any news?
