Cloud Dataflow

Welcome to the Google Cloud Dataflow idea forum. You can submit and vote on ideas here to tell the Google Cloud Dataflow team which features you’d like to see.

This forum is for feature suggestions. If you’re looking for help forums, look here:

We can’t wait to hear from you!

  1. It would be great if the Dataflow SDK worked with a newer version of Google Datastore than v1

    Whenever I want to use Dataflow together with Google Datastore, I have to use the old Datastore v1 version. In this version it seems I have to encode and decode entities of each kind manually, by extracting the Values (and knowing each value's type) and setting them on a new object. Compared with newer versions of Datastore or the Node.js client, handling Datastore objects there is a dream (Node.js just gives you the JSON representation); a sketch of the difference follows below. Would it be possible to retrieve entities by "selecting" a class type, like:
    MyObject mObj = entity.to(MyObject.class)
    or what would…

    1 vote  ·  0 comments
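
    A rough sketch of the difference described above, assuming a hypothetical POJO MyObject; the manual decoding uses the real com.google.datastore.v1 proto types, while the one-line mapping at the end is the requested (not yet existing) API:

        import com.google.datastore.v1.Entity;
        import com.google.datastore.v1.Value;

        class MyObject {
          String name;
          long count;
        }

        class EntityDecoding {
          // Manual decoding as required by the v1 proto API: each property
          // is a Value whose concrete type must be known up front.
          static MyObject fromEntity(Entity entity) {
            MyObject obj = new MyObject();
            Value name = entity.getPropertiesMap().get("name");
            obj.name = name.getStringValue();
            Value count = entity.getPropertiesMap().get("count");
            obj.count = count.getIntegerValue();
            return obj;
          }

          // The requested API would collapse the above into a single call,
          // similar to what the newer Datastore clients offer:
          // MyObject obj = entity.to(MyObject.class);  // hypothetical
        }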
  2. Ability to Downscale Google-Provided Templates

    We have a job for which the basic templated Dataflow job works fairly well, but so far we cannot see a way to make it use fewer machines. Our data ingestion is large and growing, but not yet extremely large. The three 4-vCPU, 15 GB RAM machines that are started to process our volume of data are very much overkill. I do not see any way to use these basic templates while also setting the max_workers setting (a sketch of the equivalent self-managed options follows below).

    1 vote  ·  1 comment
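
    For context, a pipeline you build and submit yourself can already cap worker count and size through the standard Dataflow pipeline options; the request is to expose the same knobs when launching a Google-provided template. A minimal sketch (the specific values are illustrative):

        import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
        import org.apache.beam.sdk.Pipeline;
        import org.apache.beam.sdk.options.PipelineOptionsFactory;

        public class SmallWorkerPipeline {
          public static void main(String[] args) {
            DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
                .withValidation()
                .as(DataflowPipelineOptions.class);
            options.setMaxNumWorkers(1);                    // cap autoscaling at one worker
            options.setWorkerMachineType("n1-standard-1");  // 1 vCPU instead of 4
            Pipeline p = Pipeline.create(options);
            // ... build the pipeline here ...
            p.run();
          }
        }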
  3. Allow custom logger appenders

    Using a custom log appender (e.g. with Logback) inside Dataflow is impossible at the moment. Any logging settings I provide seem to be superseded by Google's own appender, and my logs just show up in the Dataflow logs in Stackdriver. I want to send my logs to an Elasticsearch cluster, since the rest of my logs, generated by other non-Dataflow systems, are there as well (a sketch of the kind of appender this would enable follows below).

    1 vote  ·  0 comments
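
    The kind of appender this idea would enable, as a hedged sketch: the Logback API calls below are real, but on a Dataflow worker the registration is effectively ignored because the runner controls the SLF4J backend and routes everything to Stackdriver, which is exactly the limitation being reported. The Elasticsearch transport itself is omitted.

        import ch.qos.logback.classic.Logger;
        import ch.qos.logback.classic.LoggerContext;
        import ch.qos.logback.classic.spi.ILoggingEvent;
        import ch.qos.logback.core.AppenderBase;
        import org.slf4j.LoggerFactory;

        public class ElasticsearchForwarder extends AppenderBase<ILoggingEvent> {
          @Override
          protected void append(ILoggingEvent event) {
            // hypothetical: index event.getFormattedMessage() into an
            // Elasticsearch cluster here
          }

          // Programmatic registration on the root logger; on a Dataflow
          // worker this currently has no visible effect.
          public static void register() {
            LoggerContext ctx = (LoggerContext) LoggerFactory.getILoggerFactory();
            ElasticsearchForwarder forwarder = new ElasticsearchForwarder();
            forwarder.setContext(ctx);
            forwarder.start();
            ctx.getLogger(Logger.ROOT_LOGGER_NAME).addAppender(forwarder);
          }
        }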
  4. Customizable Columns in Overview Page

    Ability to show total worker time, maximum number of workers, and zone information on the overview page. This should be customizable, similar to what we see on the App Engine versions page.

    1 vote  ·  0 comments
  5. Dataprep: enable deleting more than one dataset

    In the DATASETS tab in Dataprep, it would be extremely helpful to be able to check multiple datasets and delete them together.

    1 vote  ·  0 comments
  6. (idea text missing)

    1 vote  ·  0 comments
  7. (idea text missing)

    1 vote  ·  0 comments
  8. (idea text missing)

    1 vote  ·  2 comments

    Hello!

    I’m not sure I understood the suggestion — perhaps the post is incomplete?

    If you could elaborate further, I’ll be happy to take a look. Thanks!

  9. autodetect

    Enable --autodetect for BigQuery loads, consistent with bq load --autodetect on the command line (a sketch of what the SDK requires today follows below).

    1 vote  ·  0 comments
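
    For comparison, writing to a new BigQuery table from the Beam Java SDK currently requires declaring the schema by hand, which is the step --autodetect would make optional. A sketch with hypothetical table and field names, where rows is an existing PCollection<TableRow>:

        import com.google.api.services.bigquery.model.TableFieldSchema;
        import com.google.api.services.bigquery.model.TableRow;
        import com.google.api.services.bigquery.model.TableSchema;
        import java.util.Arrays;
        import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
        import org.apache.beam.sdk.values.PCollection;

        class BigQueryWriteExample {
          static void write(PCollection<TableRow> rows) {
            // Today the schema must be spelled out explicitly:
            TableSchema schema = new TableSchema().setFields(Arrays.asList(
                new TableFieldSchema().setName("name").setType("STRING"),
                new TableFieldSchema().setName("count").setType("INTEGER")));
            rows.apply(BigQueryIO.writeTableRows()
                .to("my-project:my_dataset.my_table")  // hypothetical table
                .withSchema(schema));                  // --autodetect would infer this instead
          }
        }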
  10. Cannot specify diskSizeGb when launching a template

    When I create a template, it's possible to specify --diskSizeGb; but if I don't specify it at creation time, it's not possible to pass it as a parameter when launching the template (a sketch of the current workaround follows below).

    1 vote  ·  0 comments
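
    As far as I can tell, worker disk size is a construction-time pipeline option, so the only workaround today is to bake the value in when the template is staged. A minimal sketch:

        import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
        import org.apache.beam.sdk.Pipeline;
        import org.apache.beam.sdk.options.PipelineOptionsFactory;

        public class TemplateWithDiskSize {
          public static void main(String[] args) {
            DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
                .withValidation()
                .as(DataflowPipelineOptions.class);
            // Fixed when the template is created (equivalent to passing
            // --diskSizeGb=100 on the command line); it cannot currently
            // be overridden at template launch time.
            options.setDiskSizeGb(100);
            Pipeline p = Pipeline.create(options);
            // ... build the pipeline and stage the template ...
            p.run();
          }
        }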
  11. Bug: NullPointerException when reading from BigQuery

    I believe I'm experiencing a bug in BigQuerySource for Apache Beam when running on Google Dataflow. I described it in detail on Stack Overflow: https://stackoverflow.com/questions/44718323/apache-beam-with-dataflow-nullpointer-when-reading-from-bigquery/44755305#44755305

    Nobody seems to be able to respond to it there, so I'm posting it here as a potential bug.

    1 vote  ·  0 comments
  12. Dataprep: drop columns by a specific rule

    Today, in order to remove columns in Dataprep, I need to remove each column manually. When dealing with raw datasets I often find myself deleting hundreds of columns by hand because they carry little value (e.g. 90% empty). I want to be able to define a rule that deletes all columns matching a specific logical expression (e.g. delete all columns that are more than 90% empty).

    1 vote  ·  0 comments
  13. Button: "Copy Dataflow job"

    I would like to be able to copy a Dataflow job so that I can tweak the parameters and run it again without having to enter them all manually.

    1 vote  ·  0 comments