1. Introduction

This document provides a mechanism for documenting Pyomo design conversations.

2. Component Indexing and API

The following describes a component API design for Pyomo. The goal is to document design principles, and provide room for discussion of these principles in this document. This discussion focuses on 6 core components in Pyomo: Set, Var, Param, Objective, Constraint and Expression. We refer to the first three as data components, and the latter three as standard components. As we discuss below, data components are initialized and constructed differently than standard components. Further, standard components reflect the behavior of all other components in Pyomo.

Let’s restrict this discussion with the following assumptions:

  • We only consider the refinement of the existing Pyomo components.

  • We do not include explicit component data objects.

2.1. Simple Components

A simple component is declared by constructing a component without an index. Simple components are typically defined with initial values. For example:

# A simple constraint is initialized with the `expr` option
model.c = Constraint(expr=model.x >= 0)

# A simple objective is initialized with the `expr` option
model.o = Objective(expr=model.x)

# A simple expression is initialized with the `expr` option
model.e = Expression(expr=model.x)

Standard components cannot be defined without initial values:

# These declarations raise exceptions
model.c = Constraint()
model.o = Objective()
model.e = Expression()

GH

Exactly 0 of these declarations raise an exception on a ConcreteModel as of Pyomo trunk r10847. I can’t imagine they would behave differently on an AbstractModel either.

  • WEH:: Correct. But this is a design document. I think that they should generate exceptions.

The Set, Param and Var components can be constructed without initial values:

# These declarations define components without initial values
model.A = Set()
model.p = Param()
model.v = Var()

# These declarations define components with initial values
model.B = Set(initialize=[1])
model.q = Param(initialize=1.0)
model.w = Var(initialize=1.0)

The reason for this difference is that these are data components, which define placeholders for data that will be provided later. Set and parameter data can be declared abstractly, and the values of variables are defined during optimization. Hence, these components do not require initial values to specify a model.

For consistency, all Pyomo components support the len() function. By construction, all simple components have length one.

GH

All simple components do NOT have length one. See below:

model = ConcreteModel()
model.c = Constraint()              # len() -> 0
model.o = Objective()               # len() -> 0
model.e = Expression()              # len() -> 1
model.v = Var()                     # len() -> 1
model.s2 = Set(initialize=[1,2])    # len() -> 2
model.s1 = Set(initialize=[1])      # len() -> 1
model.s0 = Set(initialize=[])       # len() -> 0
model.q = Param()                   # len() -> 1

GH

This is far from consistent. Perhaps more intuitive would be for simple components to simply not have a length (because it should be implied that it is a single element). The only simple component that should have a length is a Set object.

  • WEH:: I like your suggestion of only supporting len() for simple Set components. I’ll have to think through whether this will create significant backward compatibility issues.

2.2. Indexed Components

An indexed component is declared by constructing a component with one or more index sets. Indexed components do not need to be defined with initial values. For example:

index = [1,2,3]

# Declare a component that can contain 3 sets
model.A = Set(index)

# Declare a component that can contain 3 parameters
model.p = Param(index)

# Declare a component that can contain 3 variables
model.v = Var(index)

# Declare a component that can contain 3 constraints
model.c = Constraint(index)

# Declare a component that can contain 3 objectives
model.o = Objective(index)

# Declare a component that can contain 3 expressions
model.e = Expression(index)

When no initial values are provided, an indexed component does not construct any indexed component data. Hence, the lengths of the components in this example are zero.

There are several standard techniques for initializing indexed components: (1) a rule, (2) explicit addition, and (3) data initialization. The first two options are always supported for standard components. Data components support the last option. For example:

index = [1,2,3]

model.x = Var(index)

# Initialize with a rule
def c_rule(model, i):
    if i == 2:
        return Constraint.Skip
    return model.x[i] >= 0
model.c = Constraint(index, rule=c_rule)

# Explicitly initialize with the add() method.
model.cc = Constraint(index)
model.cc.add(1, model.x[1] >= 0)
model.cc.add(3, model.x[3] >= 0)

This example further illustrates that indexed components can contain component data for a subset of the index set. In this example, the c and cc components have length 2, but the size of the index set is 3.
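The relationship between the index set and the stored component data can be sketched in plain Python. This is only an illustration, not Pyomo's implementation: SKIP and build_from_rule are hypothetical stand-ins for Constraint.Skip and rule-based construction, and constraint expressions are represented by strings.

```python
# Hypothetical stand-in for Constraint.Skip (not the Pyomo object).
SKIP = object()

def build_from_rule(index, rule):
    """Call rule(i) for every index and keep only the non-skipped results."""
    data = {}
    for i in index:
        value = rule(i)
        if value is not SKIP:
            data[i] = value
    return data

def c_rule(i):
    # Mirrors the rule above: index 2 is skipped.
    if i == 2:
        return SKIP
    return "x[%d] >= 0" % i

index = [1, 2, 3]
c = build_from_rule(index, c_rule)

print(len(index))  # size of the index set
print(len(c))      # number of stored entries
print(sorted(c))   # indices that actually hold data
```

Under this reading, the length of a component counts stored entries, while the index set only bounds which entries may exist.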

WEH

Although Gabe has proposed the use of __setitem__ to initialize indexed components, I do not think that we should make that a part of the generic API for all indexed components. It requires that the initial value of the component can be specified with (1) a single data value or (2) a component data object. We’re not allowing (2) in this discussion, and the add() method allows for the specification of an arbitrary number of data values used to initialize a component.

WEH

The BuildAction and BuildCheck components do not currently support the add() method. Hence, the always assertion in the previous paragraph is not true. Does it make sense to add a build action or check?

Data components, along with a variety of other components, support initialization with data. For example:

index = [1,2,3]

model.A = Set(index, initialize={1:[2,4,6]})
model.p = Param(index, initialize={1:1})
model.v = Var(index, initialize={1:1.0})

model.c = Constraint(index, initialize={1: model.v[1] >= 0})

The initialization data specifies the index values that are used to construct the component. Thus, all of the components have length one in this example.

The Param and Var can also be declared with special arguments to create dense configurations:

index = [1,2,3]

# Index '1' has value 1.0.  All other indices are implicitly
# defined with value 0.0.
model.p = Param(index, default=0.0, initialize={1:1.0})

# Densely initialize this component
model.v = Var(index, dense=True)

In this example, both components have length 3. The parameter component is defined with a sparse data representation that has a single component data object. The variable component is declared dense, and it uses three component data objects.
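One way to picture the difference between the reported length and the number of stored data objects is the following plain-Python sketch. The class SparseParam is hypothetical and only mimics the behavior described above; it is not how Pyomo's Param is implemented.

```python
class SparseParam:
    """Hypothetical sparse parameter: a default covers unstored indices."""

    def __init__(self, index, default, initialize):
        self._index = list(index)
        self._default = default
        self._data = dict(initialize)  # only explicitly initialized values

    def __len__(self):
        # With a default, every index has a value, so report the index size.
        return len(self._index)

    def __getitem__(self, i):
        if i not in self._index:
            raise KeyError(i)
        return self._data.get(i, self._default)

    def stored(self):
        # Number of values held explicitly.
        return len(self._data)

p = SparseParam([1, 2, 3], default=0.0, initialize={1: 1.0})
print(len(p), p.stored())
print(p[1], p[2])
```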

The Param and Var components also allow special semantics for dynamically initializing component data:

index = [1,2,3]

# Mutable parameters allow component data to be defined with the __setitem__
# and __getitem__ methods.
model.p = Param(index, initialize={1:1.0}, mutable=True)
# Here, len(model.p) is 1
model.p[2] = 2.0
# Here, len(model.p) is 2

# Variable components allow component data to be defined with the __setitem__
# or __getitem__ methods.
model.v = Var(index)
# Here, len(model.v) is 0
model.v[1].value = 1
# Here, len(model.v) is 1
vdata = model.v[2]
# Here, len(model.v) is 2

WEH

The implicit definition of component data in these two instances is problematic. For example, simply iterating over the index set and printing mutable parameter or variable values will create component data objects for all indices. However, no obvious, intuitive syntax exists for constructing component data for new indices. The add() method can be used, but this seems burdensome for users. (I looked at other programming languages, like MOSEL, and they also employ implicit initialization of variables.)

3. Flattening Indices

3.1. A Motivating Example

Consider a simple multi-commodity flow model:

from pyomo.environ import *

model = ConcreteModel()

# Sets
model.Nodes = [1,2,3]
model.Edges = [(1,2), (2,1), (1,3), (3,1)]
model.Commodities = [(1,2), (3,2)]

# Variables
model.Flow = Var(model.Commodities,
                 model.Edges,
                 within=NonNegativeReals)

There are a number of ways to interpret this syntax. Focusing on how to access a particular index, one faces the following choices:

  • Flow[c,e] for a commodity c and an edge e

  • Flow[(s,t),(u,v)] for a commodity (s,t) and an edge (u,v)

  • Flow[s,t,u,v] for a commodity (s,t) and an edge (u,v)

A modeler who is fluent in Python knows that the first two bullets are equivalent from the viewpoint of the Python interpreter, and that the third bullet is not interpreted the same as the first two. The modeler runs a quick test to determine which of the interpretations is correct:

for c in model.Commodities:
    (s,t) = c
    for e in model.Edges:
        (u,v) = e
        print(model.Flow[c,e])
        print(model.Flow[(s,t),(u,v)])
        print(model.Flow[s,t,u,v])

The modeler does not realize that her decision to use the first and second bullet forms will likely increase the build time of her model by an order of magnitude (see ./component_container_examples/slow_index.py).

3.2. Considering Unflattened Pyomo Models

Some developers have argued that tuple flattening is the correct approach because we use a similar style of indexing in math programming papers. For example, one might encounter the following notation:

  • $P_{ijk} = d_{jk}\qquad\forall\; i \in V;\; (j,k) \in E$

Advocates of tuple flattening in Pyomo would have you note that the indexing for the variable $P$ is written as $P_{ijk}$, not $P_{i(jk)}$. However, one could argue that we exclude the parentheses for the same reason that we exclude the commas, which is that it reduces clutter, and the human mind, being excellent at disambiguation, is able to extract the inferred meaning from $P_{ijk}$ perhaps more easily without the extra characters. Being the math programmers we are, we of course know that human language (even written mathematical notation) does not translate directly into an algorithm. We include pseudocode in our papers, not machine-parsable code. With these comments aside, let’s discuss the more fundamental issue with Pyomo’s Set operations.

3.2.1. A Cartesian Product it is Not

The Cartesian product can be defined using set-builder notation as: $X_1\times...\times X_n = \{(x_1,...,x_n)\;|\;x_1\in X_1,...,\;x_n\in X_n\}$. One should note from this definition that the Cartesian product is not associative (in the general case where all $X_i \neq \emptyset$). That is, $(A\times B)\times C \neq A\times B\times C \neq A\times(B\times C)$. One should also note that this definition is entirely independent of what elements make up the individual sets (e.g., carrots, objects, real numbers, elements of $\mathbb{R}^3$, 10-tuples).

With this definition in mind, let’s examine a straightforward implementation in Python. Consider how one would implement a Cartesian product across 2 sets $A,B$:

def CartesianProduct(A, B):
   prod = set()
   for a in A:
      for b in B:
         prod.add((a,b))
   return prod

Note that this implementation is sufficiently abstracted from the type of objects that are contained in each of the sets. As far as Python is concerned, if it’s hashable it can live inside of a set. Let’s make this example concrete by defining the sets $A,B$ as the following:

A = set([(1,2), (2,2)])
B = set(['a', 'b'])

Consider what an arbitrary element $x\in A\times B$ looks like:

prod = CartesianProduct(A,B)
print(((1,2),'a') in prod) # -> True
print((1,2,'a') in prod)   # -> False
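The non-associativity noted above can be checked the same way for three sets. The sketch below uses itertools.product from the standard library; the set C is an assumption added only for this illustration.

```python
from itertools import product

A = {(1, 2), (2, 2)}
B = {'a', 'b'}
C = {'x'}  # hypothetical third set, added only for this illustration

# n-ary product: elements are 3-tuples (a, b, c)
abc = set(product(A, B, C))
# left-nested product: elements are 2-tuples ((a, b), c)
ab_c = set(product(product(A, B), C))
# right-nested product: elements are 2-tuples (a, (b, c))
a_bc = set(product(A, product(B, C)))

print(((1, 2), 'a', 'x') in abc)
print((((1, 2), 'a'), 'x') in ab_c)
print(((1, 2), ('a', 'x')) in a_bc)
print(abc == ab_c, abc == a_bc, ab_c == a_bc)
```

All three sets have the same number of elements, but no two of them are equal, because the nesting of each element differs.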

Now let’s translate this example to Pyomo. Our initial attempt might be the following:

model = ConcreteModel()
model.A = Set(initialize=A)
# -> ValueError: The value=(1, 2) is a tuple for set=A, which has dimen=1

Ouch, another error. Let’s fix that:

model = ConcreteModel()
model.A = Set(dimen=2, initialize=A)
model.B = Set(initialize=B)
model.prod = model.A * model.B
print(((1,2),'a') in model.prod) # -> False
print((1,2,'a') in model.prod)   # -> True

One will note that the output from the print function shows that the resulting set model.prod violates the definition of a Cartesian product over 2 sets.
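The result above is what one would get if the product flattened its tuple members. As a rough sketch of that reading (an assumption about the observable behavior, not Pyomo's actual implementation), a hypothetical flattened_product can be written in plain Python:

```python
def flatten(item):
    """Recursively expand nested tuples into one flat tuple."""
    if not isinstance(item, tuple):
        return (item,)
    flat = ()
    for member in item:
        flat += flatten(member)
    return flat

def flattened_product(A, B):
    # Hypothetical: like CartesianProduct above, but each pair is flattened.
    return {flatten((a, b)) for a in A for b in B}

A = {(1, 2), (2, 2)}
B = {'a', 'b'}
prod = flattened_product(A, B)

print(((1, 2), 'a') in prod)
print((1, 2, 'a') in prod)
```

The elements of this set are 3-tuples, so the information that the first two entries came from a single element of A is lost.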

GH

This is the part where a new user starts yelling 4-letter words at their computer screen.

Okay, so our first attempt didn’t produce the set we were looking for. Let’s try a different approach. We’ve already computed the Cartesian product using our own function, so why don’t we just store that in a Pyomo Set object? Remembering our lesson about defining model.A, and using our knowledge that 2-tuples live inside of our 2-set Cartesian product (by definition) we go with:

model = ConcreteModel()
model.prod = Set(dimen=2, initialize=prod)
# -> ValueError: The value=(1, 2, 'a') is a tuple for set=prod, which has dimen=2

By changing dimen=2 to dimen=3, one ends up with some Python code that doesn’t raise an exception but indeed does not produce a Cartesian product.

GH

At this point, we are asking the user to abandon their intuitive notion that len(<a-tuple>) is equivalent to the dimension of <a-tuple>, and also accept the fact that Pyomo’s definition of cross product (this phrase is used in the published Pyomo book to describe the * operator on Set) is not the same as the definition found on Wikipedia. I am frustrated by this point. Are you?

The question to ask at this point is why.

  • Why is it necessary to declare a dimension for a Set?

    GH

    Is it because it is necessary to disambiguate external data file parsing (like in the DAT format)? Okay great, but I don’t use those, so why is this (or Set for that matter) a dependency for concrete modeling?

  • Why does Pyomo implement the Cartesian product this way?

    GH

    Does it have something to do with AMPL or the previous bullet point? Is it even possible to define a non-associative n-ary operator (set product) by overloading an associative binary operator (*)?

I’ll end this section with another motivating example for an interface that does not have a dependency on Set.

Exercise 1

Use the Python data structure of your choice to store a random value for each element in $A\times B$.

  • GH:: This is my answer:

import random
d = dict(((a,b), random.random()) for a in A for b in B)
# or
d = dict()
for a in A:
   for b in B:
      d[a,b] = random.random()

Exercise 2

Define a Pyomo variable for each element in $A\times B$.

  • GH:: This would be my answer, and I don’t know why we do not want to allow users to extend their thought process in this way. It has been noted in other sections of this document that I am confusing our indexed component containers with dictionaries, and that this might be indicative of a problem with the documentation. I can assure you it has nothing to do with the documentation. It has everything to do with my understanding of how we use the containers in all places in pyomo.* except the construction phase, and the fact that our component emulates dict in every sense of the word except giving users access to __setitem__. A dict is an intuitive tool for this job, whereas understanding how Pyomo’s Set works is not intuitive.

model.v = VarDict(((a,b), VarData(Reals)) for a in A for b in B)
# or
model.v = VarDict()
for a in A:
   for b in B:
      model.v[a,b] = VarData(Reals)

Here are some potential answers if you are forced to use the current version of Pyomo:

  • Abandon the definition of Cartesian product and simply use model.prod from above to index a standard Var.

  • Use the manually computed version of prod and place it inside a Pyomo SetOf object, and use that to index the variable.

  • Use namedtuple objects instead of pure tuples to trick Pyomo into doing the correct thing.

4. Rethinking Pyomo Components

4.1. Using Explicit Component Data Objects

WEH

Although Gabe has proposed the use of __setitem__ to initialize indexed components, I do not think that we should make that a part of the generic API for all indexed components. It requires that the initial value of the component can be specified with a single data value. The add() method allows for the specification of an arbitrary number of data values used to initialize a component.

  • GH:: It does not necessarily require an initial value; it could be an explicitly constructed object (e.g., x[i] = VarData(Reals)). The use of __setitem__ is a stylistic preference (that I think is intuitive for an interface that already supports half of the dictionary methods). I’m proposing some kind of assignment to an explicitly created object be allowed, and then I’m proposing a discussion about whether or not the implicit forms should be considered. We already use the implicit forms for rule based definitions of Objective, Constraint, and Expression, so it shouldn’t be a stretch to go between the forms in the list below. Once these connections are obvious, __setitem__ becomes a natural setting for explicit assignment.

    1. return model.x >= 1 → return ConstraintData(model.x >= 1)

    2. model.c[i] = model.x >= 1 → model.c[i] = ConstraintData(model.x >= 1)

    3. model.c.add(i, model.x >= 1) → model.c.add(i, ConstraintData(model.x >= 1))
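The equivalence between the implicit and explicit forms in the list above could be implemented with a small coercion step. The following sketch is purely illustrative: ConstraintData here is a hypothetical placeholder class, as_constraint_data is an invented helper, and expressions are represented by strings.

```python
class ConstraintData:
    """Hypothetical placeholder for a constraint data object."""
    def __init__(self, expr):
        self.expr = expr

def as_constraint_data(value):
    # Explicit form: already a data object, so pass it through unchanged.
    if isinstance(value, ConstraintData):
        return value
    # Implicit form: wrap the raw expression.
    return ConstraintData(value)

implicit = as_constraint_data("x >= 1")
explicit = as_constraint_data(ConstraintData("x >= 1"))

print(type(implicit).__name__, implicit.expr)
print(type(explicit).__name__, explicit.expr)
```

With such a step in place, a rule return value, an assignment, and an add() call could all accept either form.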

4.2. Concrete vs Abstract

WEH

The implicit definition of component data in these two instances is problematic. For example, simply iterating over the index set and printing mutable parameter or variable values will create component data objects for all indices. However, no obvious, intuitive syntax exists for constructing component data for new indices. The add() method can be used, but this seems burdensome for users. (I looked at other programming languages, like MOSEL, and they also employ implicit initialization of variables.)

  • GH:: I agree that these behaviors are problematic. However, there does exist an intuitive way of expressing this kind of behavior. One should simply replace Var(index) with defaultdict(lambda: VarData(Reals)) (ignoring the index checking), and it immediately makes sense, at least, to someone who knows Python, rather than to only the core Pyomo developers who’ve yet to finish changing how it works. If you were to measure the amount of burden placed on a user by (a) requiring them to populate a dictionary (using any of the dict interface methods) or (b) forcing them to spend N+ hours debugging, frustrated by this implicit behavior, I would say that (b) is the more burdensome of the two. I’ve been there, but in my situation I can immediately jump into the source code and figure out if I am crazy or if something needs to be fixed. Most users can not do that.
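The defaultdict analogy can be made concrete with a short sketch. VarData below is a hypothetical placeholder class, not the Pyomo object; the point is only that a read access is enough to create an entry.

```python
from collections import defaultdict

class VarData:
    """Hypothetical placeholder for a variable data object."""
    def __init__(self, domain):
        self.domain = domain
        self.value = None

index = [1, 2, 3]
v = defaultdict(lambda: VarData("Reals"))

print(len(v))      # nothing has been created yet
v[1].value = 1.0   # looking up v[1] creates it before the assignment
print(len(v))

# Merely reading every index creates the remaining entries.
for i in index:
    _ = v[i].value
print(len(v))
```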

GH

To summarize, I am NOT saying force everyone to use concrete modeling components. I am saying give them the option of using these more explicit components that don’t suffer from the implicit behaviors we are discussing so that they can get on with finishing their work, while we and they figure out how the abstract interface works. A Pyomo modeler is trying to do two things (1) figure out how to use Pyomo and (2) figure out how to model their problem as a mathematical program. I think intuitive component containers like Dict, List, and a Singleton along with a more explicit syntax for creating and storing optimization objects will allow them to spend less time on (1) and more time on (2).

WEH

FWIW, I think it’s a mistake to assume that users start developing with concrete models and then move to abstract models. There are plenty of contexts where this isn’t true. (For example, PySP.)

  • GH:: (a) PySP does not require abstract models. (b) One would not start with a PySP model. One would start with a deterministic Pyomo model.

4.3. Concrete Component Containers

WEH

Gabe has suggested that we have dictionary and list containers that are used for concrete models. I don’t think we want to do that, but I wanted to reserve some space for him to make his case for that.

4.3.1. Motivating Examples

EXAMPLE: Index Sets are Unnecessarily Restrictive

Consider a simple multi-commodity flow model:

from pyomo.environ import *

model = ConcreteModel()

# Sets
model.Nodes = [1,2,3]
model.Edges = [(1,2), (2,1), (1,3), (3,1)]
model.Commodities = [(1,2), (3,2)]

# Variables
model.Flow = Var(model.Commodities,
                 model.Edges,
                 within=NonNegativeReals)

There are a number of ways to interpret this syntax. Focusing on how to access a particular index, one faces the following choices:

  • Flow[c,e] for a commodity c and an edge e

  • Flow[(s,t),(u,v)] for a commodity (s,t) and an edge (u,v)

  • Flow[s,t,u,v] for a commodity (s,t) and an edge (u,v)

A modeler who is fluent in Python knows that the first two bullets are equivalent from the viewpoint of the Python interpreter, and that the third bullet is not interpreted the same as the first two. The modeler runs a quick test to determine which of the interpretations is correct:

for c in model.Commodities:
    (s,t) = c
    for e in model.Edges:
        (u,v) = e
        print(model.Flow[c,e])
        print(model.Flow[(s,t),(u,v)])
        print(model.Flow[s,t,u,v])

To her surprise, all forms seem to work. She panics, and then double checks that she hasn’t been wrong about how the built-in dict type works. She relaxes a bit after verifying that dict does not treat bullet 3 the same as the first two. Gratified in her knowledge that she actually wasn’t misunderstanding how basic Python data structures work, she moves forward on building her Pyomo model, but with a sense of perplexity about Pyomo variables. She decides to stick with the first and second bullet forms where possible, as they are much easier for her and her colleagues to read, and they work with Python dictionaries, which they are using to store data during this initial prototype.
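Her check of the built-in dict might look like the following sketch (the dictionary flow and its values are made up for illustration):

```python
commodities = [(1, 2), (3, 2)]
edges = [(1, 2), (2, 1), (1, 3), (3, 1)]

# A plain dict keyed by (commodity, edge) pairs.
flow = {(c, e): 0.0 for c in commodities for e in edges}

c, e = commodities[0], edges[1]
(s, t), (u, v) = c, e

print((c, e) in flow)            # first bullet form
print(((s, t), (u, v)) in flow)  # second bullet form, the same key
print((s, t, u, v) in flow)      # third bullet form, a different key
```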

WEH

FWIW, I have yet to hear a user panic (or otherwise raise concerns) about the apparent inconsistency described here. The Flow object is not a dictionary, nor was it advertised as such.

  • GH:: I guess I am her ;) a few years ago. I don’t think I am alone in that when I encounter something I don’t expect, I question what I know, which for programming may lead some to question whether or not they have written broken code because of it.

    • WEH:: I see a consistent thread of your comments where you were treating components as dictionaries, but they weren’t, which you found frustrating. I’m wondering how much of your frustration would be addressed by better documentation.

The modeler makes her first attempt at a flow balance constraint:

def FlowBalanceConstraint_rule(model, c, u):
    out_flow = sum(model.Flow[c,e] for e in model.EdgesOutOfNode[u])
    in_flow = sum(model.Flow[c,e] for e in model.EdgesInToNode[u])
    ...
    return ...
model.FlowBalanceConstraint = Constraint(model.Commodities,
                                         model.Nodes,
                                         rule=FlowBalanceConstraint_rule)

To her dismay she gets the following error:

TypeError: FlowBalanceConstraint_rule() takes exactly 3 arguments (4 given)

Note: The modeler’s constraint rule would have worked had she wrapped her set declarations in SetOf(), or had she used something like collections.namedtuple as elements of the set rather than a pure tuple.

GH

We all know what the solution to this example is. However, once we flatten c to s,t in the function arguments, the rule definition for this model is no longer generic. If the dimension of elements in Commodities changes, so does the rule. The workarounds in the note above were not apparent to me until I created these examples. Do we consider them a bug or a feature? Whatever the case may be, for any user who might stumble across these workarounds, it will be far from intuitive why these approaches allow one to write def FlowBalanceConstraint_rule(model, c, u). I would be surprised if any other developers knew of these workarounds as well.
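The argument-count error can be reproduced without Pyomo. In the sketch below, call_rule_flattened is a hypothetical helper that mimics passing a flattened index as positional arguments; it is an assumption about the mechanism, not Pyomo's code.

```python
def flatten(index):
    """Expand nested tuples in an index into one flat tuple."""
    flat = ()
    for member in index:
        flat += flatten(member) if isinstance(member, tuple) else (member,)
    return flat

def call_rule_flattened(rule, model, index):
    # Hypothetical: the index is flattened before the rule is called.
    return rule(model, *flatten(index))

def generic_rule(model, c, u):
    # Written for a commodity c and a node u.
    return (c, u)

def flattened_rule(model, s, t, u):
    # Works, but hard-codes that a commodity is a 2-tuple.
    return ((s, t), u)

index = ((1, 2), 3)  # commodity (1, 2) at node 3

try:
    call_rule_flattened(generic_rule, None, index)
    outcome = "ok"
except TypeError:
    outcome = "TypeError"

print(outcome)
print(call_rule_flattened(flattened_rule, None, index))
```

The second rule succeeds only because its signature encodes the dimension of a commodity, which is the loss of genericity described above.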

Morals
  • Abstract modeling that involves flattening tuples is not abstract (or generic).

  • Supporting Flow[c,e] should not be the expensive hack that it is.

WEH

I do not follow the conclusion that we need new modeling components. Rather, I think this motivates a reconsideration of the use of argument flattening.

Motivates
  • VarDict: Because I shouldn’t have to go through some confusing tuple flattening nonsense in order to organize a collection of optimization variables into a container. It should be up to me whether Flow should be indexed as Flow[i,j,k,l,p,s,t,o] or Flow[k1, p, k2], and I shouldn’t pay an order of magnitude penalty for the more concise syntax.

  • More intuitive containers for optimization modeling objects that provide more flexibility over how I organize these objects, allowing me to write self-documenting code. E.g., Dict (at a minimum) and List objects for most, if not all, of the component types, including Block. The important pieces are the (currently named) XData objects, and we should be making less of a fuss about how users organize these. There could be extremely trivial and stable implementations of Singleton, Dict, and List containers that anyone familiar with Python would easily understand how to use after reading 20 lines of example code. Example Documentation:

model.x = VarSingleton(Reals, bounds=(0,1))
# Behaves like dict
model.X = VarDict()
for i in range(5,10):
   # using explicit instantiation
   model.X[i] = VarData(Binary, bounds=(0,1))
   # or
   # using implicit instantiation
   model.X[i] = Binary
   model.X[i].setlb(0)
   model.X[i].setub(1)

model.c = ConstraintSingleton(model.x >= 0.5)
# Behaves like list
model.C = ConstraintList()
for i in model.X:
   # using explicit instantiation
   model.C.append(ConstraintData(model.X[i] >= 1.0/i))
   # or
   # using implicit instantiation
   model.C.append(model.X[i] >= 1.0/i)

EXAMPLE: Jagged Index Sets Are Not Intuitive

It is not intuitive why something like this:

model.A = [1,2]
model.B = {1: ['a','b'], 2: ['c','d']}

model.C = ConstraintDict()
for i in model.A:
    for j in model.B[i]:
        model.C[i,j] = ...
        # or
        model.C[i,j] = ConstraintData(...)

WEH

This example could be written without ConstraintDict, so this isn’t a motivating example for ConstraintDict (as is suggested below).

needs to be written as:

model.A = [1,2]
model.B = {1: ['a','b'], 2: ['c','d']}

def C_index_rule(model):
   d = []
   for i in model.A:
       for j in model.B[i]:
           d.append((i,j))
   return d
model.C_index = Set(dimen=2, initialize=C_index_rule)
def C_rule(model, i, j):
    return ...
model.C = Constraint(model.C_index, rule=C_rule)

Note that the use of __setitem__ is not the critical take-home point from this example. Constraint does have an add() method, and this could be used to fill the constraint in a for loop. It is the construction of the intermediate set that should not be necessary.

WEH

The word needs is too strong here. The first example is for a concrete model, and the second is for an abstract model. You seem to be complaining that it’s harder to write an abstract model. To which I respond "so what?"

  • GH:: Agreed. I approach that next. Showing the above motivates the idea that even if you want to use abstract modeling, getting an initial prototype working can be done in a much more concise manner using a concrete approach. That is, concrete modeling can be useful even to people who like abstract modeling. However, the current containers are implemented in such a way as to make straightforward concrete modeling behaviors (such as what is shown below) susceptible to very unintuitive traps brought about by implicit behaviors designed to handle edge cases in the abstract setting.

A more concrete approach using the Constraint component might be to try:

model.A = [1,2]
model.B = {1: ['a','b'], 2: ['c','d']}
model.C_index = [(i,j) for i in model.A for j in model.B[i]]
model.C = Constraint(model.C_index)

RuntimeError: Cannot add component 'C_index' (type <class 'pyomo.core.base.sets.SimpleSet'>) to block 'unknown': a component by that name (type <type 'list'>) is already defined.

If you are lucky, you get a response from the Pyomo forum the same day for this black-hole of an error, and realize you need to do the following (or just never do something as stupid as naming the index for a component <component-name>_index):

model.A = [1,2]
model.B = {1: ['a','b'], 2: ['c','d']}
model.C_index = Set(initialize=[(i,j) for i in model.A for j in model.B[i]])
model.C = Constraint(model.C_index)
for i,j in model.C_index:
   model.C.add((i,j), ...)

Perhaps by accident, you later realize that you can call add() with indices that are not in C_index (without error), leaving you wondering why you defined C_index in the first place.

Morals
  • Defining an explicit index list just to fill something over that index is not intuitive and it takes the fun out of being in Python.

  • The connection between components and their index set is weaker than most developers think. There’s not much point in requiring there even be a connection outside the narrow context of rule-based abstract modeling.

  • Implicit creation of index sets that occurs for Constraint and other indexed components is not intuitive and leads to errors that are impossible to understand. Users have enough to think about when formulating their model. They should be able to script these things in a concrete setting for initial toy prototypes without having to deal with errors that arise from the implicit behaviors related to Set objects (including tuple flattening).

WEH

If the complaint is that our temporary sets get exposed to users and cause errors, I agree.

WEH

If the complaint is that our users might not want to use simpler concrete modeling constructs, then I disagree. I don’t think we should move to only support concrete models in Pyomo.

  • GH:: I am not suggesting we only support concrete modeling in Pyomo. I am suggesting we allow concrete modeling to be done separated from these issues. I don’t think this separation can occur without backward incompatible changes to the interface. It is also not clear whether these issues will ever be fully resolved with incremental changes to the current set of component containers. I think the containers I am prototyping and pushing for accomplish two things: (1) provide users with a stable API that is, IMHO, intuitive for many to understand, requires much less code, requires much less documentation, and would not need to change, and (2) provide a stable building block on which the abstract interface can be improved over a realistic time scale.

Motivates
  • ConstraintDict: Because I’m just mapping a set of hashable indices to constraint objects. A MutableMapping (e.g., dict) is a well defined interface for doing this in Python. Why force users to learn a different interface, especially one that doesn’t even exist yet (because we cannot agree on what it should look like)?

WEH

No, I don’t think this motivates the use of ConstraintDict. I can use the Constraint object in the example above. If we’re concerned that we have an explicit Set object in the model, then let’s fix that.

WEH

What different interface? What interface doesn’t exist? Why are you forcing me to learn the MutableMapping Python object? (I hadn’t heard of this object before today, so I don’t think you can argue that this will be familiar to Python users.)

  • GH:: The Constraint interface. Is there a concise way to describe the Constraint interface that is well understood and documented? The best I can come up with is "A singleton, dict-like hybrid that supports a subset of the functionality of dict (no __setitem__), as well as a method commonly associated with the built-in set type (add), along with various Pyomo related methods." The idea of redesigning it (the non-singleton case) as a MutableMapping (whether or not you have heard of that: https://docs.python.org/3/library/collections.abc.html) is that the set of methods it carries related to storing objects is very well documented and can be succinctly described as "behaving like dict".
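As a rough indication of how little code such a container needs, here is a minimal sketch built on collections.abc.MutableMapping. The class name ConstraintDict follows the proposal above, but this implementation is an assumption made for illustration, with strings standing in for constraint expressions.

```python
from collections.abc import MutableMapping

class ConstraintDict(MutableMapping):
    """Hypothetical dict-like container for constraint objects."""

    def __init__(self):
        self._data = {}

    # The five abstract methods; MutableMapping derives the rest
    # (keys, items, values, get, pop, update, setdefault, ...).
    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

A = [1, 2]
B = {1: ['a', 'b'], 2: ['c', 'd']}

C = ConstraintDict()
for i in A:
    for j in B[i]:
        C[i, j] = "x[%d] >= 0" % i

print(len(C))
print((1, 'a') in C, (1, 'c') in C)
print(sorted(C.keys()))
```

Only the five methods shown are written by hand; the remaining dictionary methods come from the abstract base class, which is what makes "behaves like dict" a short and checkable description.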

EXAMPLE: Annotating Models

Model annotations are naturally expressed using a Suffix. Consider some meta-algorithm scripted with Pyomo that requests that you annotate constraints in your model with the type of convex relaxation technique to be employed. E.g.,

model.convexify = Suffix()
model.c1 = Constraint(expr=model.x**2 >= model.y)
model.convexify[model.c1] = 'technique_a'

When you apply this approach to a real model, you are likely to encounter cases like the following:

def c_rule(model, i, j, k, l):
   if (i,j) >= l:
       if k <= i:
           return ...
       else:
           return ...
   else:
       if i+j-1 == l:
           return ...
       else:
           return Constraint.Skip
model.c = Constraint(model.index, rule=c_rule)

How does one annotate this model when only certain indices of constraint c are nonlinear? You copy and paste:

def c_annotate_rule(model, i, j, k, l):
   if (i,j) >= l:
       if k <= i:
           model.convexify[model.c[i,j,k,l]] = 'technique_a'
       else:
           pass
   else:
       if i+j-1 == l:
           pass
       else:
           pass
model.c_annotate = BuildAction(model.index, rule=c_annotate_rule)

It is a bug waiting to happen. It is an unfortunate result of the Abstract modeling framework that there is not a better way to write this. However, it can be written using a single for loop if doing Concrete modeling (or using a BuildAction) AND using a Constraint container that allows it (e.g., ConstraintDict using __setitem__ or Constraint using add()). Example:

model.c = ConstraintDict()
for i,j,k,l in model.index:
   if (i,j) >= l:
       if k <= i:
           model.c[i,j,k,l] = ...
           model.convexify[model.c[i,j,k,l]] = 'technique_a'
       else:
           model.c[i,j,k,l] = ...
   else:
       if i+j-1 == l:
           model.c[i,j,k,l] = ...

Motivates
  • Explicit rather than Implicit: Because why do I need to create a set and have something implicitly defined for me, when I can explicitly define the thing inside a for loop and place related logic next to each other (rather than in a copy-pasted identical for loop). Perhaps this is necessary in the narrow scope of rule-based abstract modeling, but it should not be necessary in the context of concrete modeling.
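The single-loop pattern can be tried without Pyomo at all. In the minimal sketch below, one plain dict plays the role of ConstraintDict and another plays the role of the Suffix. The index set, the branch conditions, and the string "constraints" are all invented for illustration; they are not taken from the example above. The annotation is also keyed by index here rather than by constraint object.

```python
# Invented two-dimensional index set (illustration only)
index = [(i, j) for i in range(3) for j in range(3)]

constraints = {}   # stand-in for ConstraintDict
convexify = {}     # stand-in for the Suffix

for i, j in index:
    if i < j:
        # "Nonlinear" case: declare and annotate in the same place
        constraints[i, j] = f"x[{i}]**2 >= y[{j}]"
        convexify[i, j] = 'technique_a'
    elif i == j:
        # "Linear" case: declare only
        constraints[i, j] = f"x[{i}] >= y[{j}]"
    # i > j: skipped, nothing to declare or annotate
```

Because the declaration and the annotation sit in the same branch, every annotated key is guaranteed to exist in the constraint container, which is the property the copy-pasted rule cannot guarantee.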

WEH

You are implying that the concrete examples above cannot be supported by Pyomo today. I don’t believe that’s true. Can you confirm?

  • GH:: I can confirm that Pyomo DOES support this today (just use Constraint.add()). But as the example prior to this one points out, using Constraint in a concrete setting is awkward, due to the implicit behaviors of Set as well as the idea that a Constraint without an index is a singleton, but a Constraint with an index can be populated with any number of keys not in that index using Constraint.add() (so why do we force a connection during declaration?). It is very intuitive that when I say something is a dict, it means I’m going to populate it with keys mapping to some set of objects. There should not be a need to declare an index for this dict prior to populating it.

WEH

This does seem to illustrate a limitation of abstract models. But how does this change our design of Pyomo?

  • GH:: The take-home from these examples is that concrete modeling in Pyomo is being made unnecessarily awkward by trying to cram both abstract and concrete behavior into a single component that behaves both as a singleton and dict-like object. Concrete modeling should be made easier and more intuitive, since this is THE place to start for testing or debugging a model. Picture firing up the Python interactive interpreter and typing the five lines necessary to figure out the behavior for component A in some context. I’m not going to create a separate data file to do this (unless the problem has to do with importing data). I can’t necessarily know if the problem has to do with importing data unless I verify that I understand the intended behavior with concrete components.

4.3.2. Extending to Other Components

As of r10847, Pyomo trunk includes a Dict prototype for Expression, Objective, and Constraint. Extending this functionality to Block and Var would not be a difficult undertaking. This would necessarily include:

  1. Deciding on a pure abstract interface for BlockData and VarData.

  2. Implementing a general purpose version of this interface.

  3. A developer discussion about implicit vs. explicit creation of XData components. E.g., VarDict[i,j] = Reals vs. VarDict[i,j] = VarData(Reals), and whether we never support the implicit form, or support it only for some components. For instance, BlockData() shouldn’t require any arguments (that I can think of), so supporting implicit creation during an explicit assignment is a bit goofy (e.g., BlockDict[i,j] = None?).

    WEH

    I think we need to discuss the pure abstract interface that you refer to. Although I’ve seen the commits you made recently, I don’t understand why they are required.
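The implicit vs. explicit question in item 3 can be made concrete with a small sketch. Everything here is hypothetical: VarData and VarDict are stand-in classes written for this illustration, not the Pyomo prototype, and Reals is a placeholder string rather than a real domain object. The `implicit` flag simply switches between the two candidate policies.

```python
from collections.abc import MutableMapping

Reals = "Reals"  # placeholder for a domain object


class VarData:
    """Hypothetical stand-in for a variable data object."""

    def __init__(self, domain):
        self.domain = domain


class VarDict(MutableMapping):
    """Hypothetical container; `implicit` toggles the two policies."""

    def __init__(self, implicit=False):
        self._data = {}
        self._implicit = implicit

    def __setitem__(self, key, value):
        if not isinstance(value, VarData):
            if not self._implicit:
                raise TypeError("expected a VarData object")
            # Implicit form: wrap the raw domain on the user's behalf
            value = VarData(value)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


explicit = VarDict(implicit=False)
explicit[1, 2] = VarData(Reals)   # explicit form

implicit = VarDict(implicit=True)
implicit[1, 2] = Reals            # implicit form, wrapped internally
```

Under the implicit policy the right-hand side must carry whatever the data object's constructor needs. A block-like data object that takes no constructor arguments would have nothing natural to put there, which is the awkwardness item 3 points at.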

4.3.3. A List Container

I’m less attached to this idea. But if you support XDict, it’s hard to think of any reason why NOT to provide XList.

WEH

I don’t think we need VarDict because we already have Var, and I don’t think we need VarList because we already have VarList. I’m not seeing what a different component layer adds to Pyomo.

GH

The list of inconsistencies and awkward behaviors that have been discussed throughout the document above is far from complete. Drawing on my experiences as a developer who has tried to make Pyomo core more intuitive in the concrete setting, the only conclusion I can draw at this point is that we need a cleaner separation of the concrete and abstract interfaces. I know we all want to make Pyomo better, but we have different ideas about these core components, and I have no doubt that Pyomo core will continue to go back and forth with these issues as long as an abstract and concrete interface try to live in the same component container. IMHO, designing a Concrete-only interface that we all agree upon will be a trivial exercise. Additionally, I think rebuilding ALL of the current abstract functionality on top of these concrete building blocks is another trivial exercise (we can even include the current inconsistencies). We can provide the stable concrete interface now, and work on improvements and fixes to the abstract interface that would necessarily take place over a longer time period because of backward incompatibility concerns as well as developer disagreement over what the correct behavior is.