Develop a new negotiator
------------------------

In this section, we go through the process of developing a new agent for
the Stacked Alternating Offers Protocol. First, let’s look at an
example from the previous tutorial in which we simulate a buyer and a
seller negotiating over a deal using time-based negotiators.

.. code:: ipython3

    # create negotiation agenda (issues)
    issues = [
        make_issue(name="price", values=10),
        make_issue(name="quantity", values=(1, 11)),
        make_issue(name="delivery_time", values=10),
    ]

    # create the mechanism
    session = SAOMechanism(issues=issues, n_steps=20)

    # define ufuns
    seller_utility = LUFun(
        values={
            "price": IdentityFun(),
            "quantity": LinearFun(0.2),
            "delivery_time": AffineFun(-1, bias=9),
        },
        weights={"price": 1.0, "quantity": 1.0, "delivery_time": 10.0},
        outcome_space=session.outcome_space,
        reserved_value=15,
    ).scale_max(1.0)
    buyer_utility = LUFun(
        values={
            "price": AffineFun(-1, bias=9.0),
            "quantity": LinearFun(0.2),
            "delivery_time": IdentityFun(),
        },
        outcome_space=session.outcome_space,
        reserved_value=10,
    ).scale_max(1.0)

    session.add(AspirationNegotiator(name="buyer"), ufun=buyer_utility)
    session.add(AspirationNegotiator(name="seller"), ufun=seller_utility)
    session.run()
    session.plot()
    plt.show()


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_1_0.png


The negotiation ended with an agreement far from the pareto-front
(pareto-distance = :math:`0.31`) which does not seem like a good result.
What is the problem?

Looking carefully at the 2D representation of the negotiation above, we
can immediately see the issue: uninformed concession. Consider the buyer
agent. It started by offering the best outcome for itself (the green
offers above) and repeated it for a while (as expected), then started
conceding (i.e. offering outcomes with lower utility for itself).
Nevertheless, when it did concede, it did not consider its partner at
all. The figure below shows the same figure focusing on one specific
choice the buyer made: |Uninformed Concession|

The problem is highlighted in orange. Even though the buyer had several
offers that are *of the same utility for itself*, they are not of the
same utility *to its partner*. In the figure, it is clear that the
buyer chose the offer that was in fact the *worst* for its opponent.
Choosing any other offer in the orange rectangle would have been better
as it is nearer to the pareto-front. By offering this way, the partners
are *leaving money on the table*.
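
To make the notion of the pareto-front concrete, here is a minimal
sketch that does not use NegMAS at all. The helper names ``dominates``
and ``pareto_front`` and the utility pairs are made up for illustration.
Note that the first two points have the same utility for the buyer but
only one of them is on the front.

.. code:: python

    def dominates(u, v):
        """True if u is at least as good as v for everyone and better for someone."""
        return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


    def pareto_front(points):
        """Keep only the utility vectors that no other vector dominates."""
        return [p for p in points if not any(dominates(q, p) for q in points)]


    # made-up (buyer utility, seller utility) pairs for four candidate offers
    utils = [(0.8, 0.1), (0.8, 0.4), (0.5, 0.7), (0.4, 0.6)]
    front = pareto_front(utils)
    print(front)

Offering ``(0.8, 0.1)`` instead of ``(0.8, 0.4)`` costs the buyer
nothing to avoid, which is exactly the kind of uninformed concession
described above.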

Could the buyer have done better? Yes. By the time it gave this offer,
it had already received several offers from the seller which could have
been used to *estimate* the utility, for the seller, of the different
outcomes the buyer is considering offering (highlighted in red).

| The idea we want to implement is pretty simple:
| **Use a time-based concession strategy but always offer the outcome
  nearest to the first offer of the partner**

The intuition behind this simple strategy relies on two assumptions:

1. The partner’s first offer is most likely its best outcome. Most
   negotiators will start with the best outcome for themselves.
2. Nearer outcomes in the outcome-space, are likely to have similar
   utilities (i.e. the utility function of the partner is smooth).

Both of these assumptions are sometimes violated but our goal here is to
develop a *simple* yet useful negotiator not to end once and for all the
hunt for the most effective negotiation strategy. With that said, let’s
dive in.
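
Before touching NegMAS, the core idea can be sketched in plain Python.
This is only an illustration under the two assumptions above:
``nearest_to`` and ``squared_distance`` are hypothetical helpers, and
the offers are made-up tuples that we assume are equally good for the
buyer.

.. code:: python

    def squared_distance(a, b):
        """Squared Euclidean distance between two outcomes given as tuples."""
        return sum((x - y) ** 2 for x, y in zip(a, b))


    def nearest_to(candidates, reference):
        """Return the candidate outcome closest to ``reference``."""
        return min(candidates, key=lambda o: squared_distance(o, reference))


    # made-up offers assumed to have the same utility for the buyer
    candidates = [(2, 9, 7), (1, 4, 7), (3, 10, 9)]
    # made-up first offer of the seller (assumed to be its best outcome)
    partner_first = (9, 11, 0)
    best_for_partner = nearest_to(candidates, partner_first)
    print(best_for_partner)

Among offers that cost the buyer the same, this picks the one we guess
the seller likes most.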

Building a random negotiator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As a first step, we will build a negotiator that acts randomly.

Let’s assume that we are too lazy to even read the documentation and
want to learn how to develop a negotiator for the stacked alternating
offers protocol. The first thing is to create a negotiator class and see
which methods we need to override. All negotiators for the SAO
mechanism should inherit from the ``SAONegotiator`` base class. Let’s
try to do that:

.. |Uninformed Concession| image:: uninformed.png

.. code:: ipython3

    class RandomNegotiator(SAONegotiator):
        ...


    try:
        RandomNegotiator()
    except Exception as e:
        print(e)

This is telling us that there is one (and only one) required abstract
method that we need to override called ``propose()``. This is the
signature of this method:

.. code:: python

   def propose(self, state: SAOState, dest: str | None = None) -> Outcome:
       ...

It receives the negotiation ``state`` which has all information
available to the negotiator about the current state of the negotiation
and generates an outcome to *offer* to the opponent. That is it.
Moreover, we should know that the negotiator always has access to a
``NegotiatorMechanismInterface`` object that gives it unchanging
information about the negotiation (for example the number of allowed
rounds, any real-time limits on the negotiation, the number of partners,
etc). This interface is accessible through the ``nmi`` member of the
negotiator. With this knowledge, we can build our first negotiator which
will simply offer randomly.

.. code:: ipython3

    class RandomNegotiator(SAONegotiator):
        def propose(self, state, dest: str | None = None):
            return self.nmi.random_outcomes(1)[0]

Let’s define a helper function for testing our negotiator that replaces
the buyer and/or seller negotiators in the code sample we used above:

.. code:: ipython3

    def try_negotiator(cls, replace_buyer=True, replace_seller=True, plot=True, n_steps=20):
        buyer_cls = cls if replace_buyer else AspirationNegotiator
        seller_cls = cls if replace_seller else AspirationNegotiator

        # create negotiation agenda (issues)
        issues = [
            make_issue(name="price", values=10),
            make_issue(name="quantity", values=(1, 11)),
            make_issue(name="delivery_time", values=10),
        ]

        # create the mechanism
        session = SAOMechanism(issues=issues, n_steps=n_steps)

        # define ufuns
        seller_utility = LUFun(
            values={
                "price": IdentityFun(),
                "quantity": LinearFun(0.2),
                "delivery_time": AffineFun(-1, bias=9),
            },
            weights={"price": 1.0, "quantity": 1.0, "delivery_time": 10.0},
            outcome_space=session.outcome_space,
            reserved_value=15.0,
        ).scale_max(1.0)
        buyer_utility = LUFun(
            values={
                "price": AffineFun(-1, bias=9.0),
                "quantity": LinearFun(0.2),
                "delivery_time": IdentityFun(),
            },
            outcome_space=session.outcome_space,
            reserved_value=10.0,
        ).scale_max(1.0)

        session.add(buyer_cls(name="buyer"), ufun=buyer_utility)
        session.add(seller_cls(name="seller"), ufun=seller_utility)
        session.run()
        if plot:
            session.plot()
            plt.show()
        return session

… and try our first attempt:

.. code:: ipython3

    s = try_negotiator(RandomNegotiator)


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_9_0.png


What just happened? It seems that the buyer offered a single offer which
was **immediately** accepted by the seller. We can check that explicitly
by looking at the negotiation *trace* which stores all the offers
exchanged (along with the agent that offered it):

.. code:: ipython3

    s.trace


.. parsed-literal::

    [('buyer-bd2870df-6a60-4a8b-8c78-1cbe4e8583d4', (3, 5, 8)),
     ('seller-41ae35d0-af57-46d3-970c-165fe031aa13', (5, 10, 2)),
     ('buyer-bd2870df-6a60-4a8b-8c78-1cbe4e8583d4', (2, 10, 5)),
     ('seller-41ae35d0-af57-46d3-970c-165fe031aa13', (9, 1, 9)),
     ('buyer-bd2870df-6a60-4a8b-8c78-1cbe4e8583d4', (6, 8, 7))]


Why did this happen? To answer this question, let’s try to run another
negotiation but replacing only the buyer

.. code:: ipython3

    s2 = try_negotiator(RandomNegotiator, replace_seller=False)


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_13_0.png


The seller behaves as the time-based aspiration negotiator is expected
to behave. It starts at its best outcome then it concedes slowly. Our
random buyer agent also seems to behave as expected, it offers outcomes
all over the place. What happened in this case, is that the buyer
accepted some offer from the seller. How did it decide to do so? We did
not implement a way for our negotiator to make this decision.

The default acceptance strategy in NegMAS is to accept an outcome **if
and only if it has a utility for the negotiator better or equal to
whatever offer it would have proposed at this negotiation state**.

So this is what happened: the buyer agent received some offer from the
aspiration negotiator and called our ``propose`` method to see what
outcome it would have offered. Because our ``propose`` behaved randomly,
it returned some outcome that has a utility less than or equal to the
utility for the buyer of the seller’s offer and that is why it accepted.

It is clear that the default acceptance strategy in NegMAS does not make
sense for our random negotiator (not that random offering makes sense in
the first place :-) ).
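
The rule stated above can be sketched in a few lines of plain Python.
This only illustrates the rule as described in the text, not the NegMAS
source code; ``default_accept`` and ``buyer_ufun`` are hypothetical
names and the outcomes are made up.

.. code:: python

    def default_accept(ufun, offer, my_next_offer):
        """Accept iff the received offer is at least as good as my own next offer."""
        return ufun(offer) >= ufun(my_next_offer)


    def buyer_ufun(outcome):
        """A made-up buyer utility over (price, quantity) outcomes."""
        price, quantity = outcome
        return (9 - price) + 0.2 * quantity


    received = (4, 5)  # utility 6.0 for the buyer
    accepts = default_accept(buyer_ufun, received, (7, 5))  # own offer is worth 3.0
    rejects = not default_accept(buyer_ufun, received, (1, 5))  # own offer is worth 9.0
    print(accepts, rejects)

With a random ``propose``, the second argument is drawn at random every
time, so whether an offer is accepted is largely a matter of luck.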

**Can you see why the first negotiation we attempted between our two
random agents ended up at the first offer?**

Let’s test your answer by checking if it explains what happens when we
repeat the process and plot a histogram of the step (round) at which the
negotiation ended.

.. code:: ipython3

    ended_at = [
        try_negotiator(RandomNegotiator, plot=False).state.step for _ in range(1000)
    ]
    plt.hist(ended_at)
    plt.show()


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_15_0.png


Better acceptance strategy
~~~~~~~~~~~~~~~~~~~~~~~~~~

So how can we slightly improve our random negotiator? We can make it
accept offers only if they are above some threshold. To do that we need
to override the ``respond`` method which is used by the ``SAOMechanism``
to check if an outcome is acceptable for the negotiator. It has the
following signature:

.. code:: python

   def respond(self, state: SAOState, source: str | None = None) -> ResponseType:
       ...

The ``ResponseType`` returned is an enum with different possible
options. We are only interested in three of them:

- ACCEPT_OFFER: Accept
- REJECT_OFFER: Reject
- END_NEGOTIATION: End the negotiation immediately

Here is how we can add our acceptance strategy:

.. code:: ipython3

    class BetterRandomNegotiator(RandomNegotiator):
        def respond(self, state, source: str = ""):
            offer = state.current_offer
            if self.ufun(offer) > 0.9:
                return ResponseType.ACCEPT_OFFER
            return ResponseType.REJECT_OFFER

The only new thing for us here is that the negotiator can access its
*own* utility function using ``self.ufun``. Let’s try to replace both
agents with our slightly better random negotiator:

.. code:: ipython3

    s3 = try_negotiator(BetterRandomNegotiator)


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_19_0.png


Now *both* agents are proposing randomly. How can we check that our
*complicated* acceptance strategy is implemented correctly?

We can check that the agent that accepted the final offer (the seller in
this case) had a utility above *0.9*. To do that we need to know a
little bit about the ``state`` object which we receive in both
``propose`` and ``respond`` and can access at any time on the mechanism
object using the ``state`` property. Here is the final state of the
negotiation:

.. code:: ipython3

    print(s3.state)


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="color: #800080; text-decoration-color: #800080; font-weight: bold">SAOState</span><span style="font-weight: bold">(</span>
        <span style="color: #808000; text-decoration-color: #808000">running</span>=<span style="color: #ff0000; text-decoration-color: #ff0000; font-style: italic">False</span>,
        <span style="color: #808000; text-decoration-color: #808000">waiting</span>=<span style="color: #ff0000; text-decoration-color: #ff0000; font-style: italic">False</span>,
        <span style="color: #808000; text-decoration-color: #808000">started</span>=<span style="color: #00ff00; text-decoration-color: #00ff00; font-style: italic">True</span>,
        <span style="color: #808000; text-decoration-color: #808000">step</span>=<span style="color: #008080; text-decoration-color: #008080; font-weight: bold">3</span>,
        <span style="color: #808000; text-decoration-color: #808000">time</span>=<span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.0007858330063754693</span>,
        <span style="color: #808000; text-decoration-color: #808000">relative_time</span>=<span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.19047619047619047</span>,
        <span style="color: #808000; text-decoration-color: #808000">broken</span>=<span style="color: #ff0000; text-decoration-color: #ff0000; font-style: italic">False</span>,
        <span style="color: #808000; text-decoration-color: #808000">timedout</span>=<span style="color: #ff0000; text-decoration-color: #ff0000; font-style: italic">False</span>,
        <span style="color: #808000; text-decoration-color: #808000">agreement</span>=<span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">3</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">11</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0</span><span style="font-weight: bold">)</span>,
        <span style="color: #808000; text-decoration-color: #808000">results</span>=<span style="color: #800080; text-decoration-color: #800080; font-style: italic">None</span>,
        <span style="color: #808000; text-decoration-color: #808000">n_negotiators</span>=<span style="color: #008080; text-decoration-color: #008080; font-weight: bold">2</span>,
        <span style="color: #808000; text-decoration-color: #808000">has_error</span>=<span style="color: #ff0000; text-decoration-color: #ff0000; font-style: italic">False</span>,
        <span style="color: #808000; text-decoration-color: #808000">error_details</span>=<span style="color: #008000; text-decoration-color: #008000">''</span>,
        <span style="color: #808000; text-decoration-color: #808000">erred_negotiator</span>=<span style="color: #008000; text-decoration-color: #008000">''</span>,
        <span style="color: #808000; text-decoration-color: #808000">erred_agent</span>=<span style="color: #008000; text-decoration-color: #008000">''</span>,
        <span style="color: #808000; text-decoration-color: #808000">threads</span>=<span style="font-weight: bold">{}</span>,
        <span style="color: #808000; text-decoration-color: #808000">last_thread</span>=<span style="color: #008000; text-decoration-color: #008000">''</span>,
        <span style="color: #808000; text-decoration-color: #808000">current_offer</span>=<span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">3</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">11</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0</span><span style="font-weight: bold">)</span>,
        <span style="color: #808000; text-decoration-color: #808000">current_proposer</span>=<span style="color: #008000; text-decoration-color: #008000">'buyer-63fd1074-5d79-4f49-9226-32fe564bf016'</span>,
        <span style="color: #808000; text-decoration-color: #808000">current_proposer_agent</span>=<span style="color: #800080; text-decoration-color: #800080; font-style: italic">None</span>,
        <span style="color: #808000; text-decoration-color: #808000">n_acceptances</span>=<span style="color: #008080; text-decoration-color: #008080; font-weight: bold">2</span>,
        <span style="color: #808000; text-decoration-color: #808000">new_offers</span>=<span style="font-weight: bold">[(</span><span style="color: #008000; text-decoration-color: #008000">'buyer-63fd1074-5d79-4f49-9226-32fe564bf016'</span>, <span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">3</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">11</span>, <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0</span><span style="font-weight: bold">))]</span>,
        <span style="color: #808000; text-decoration-color: #808000">new_offerer_agents</span>=<span style="font-weight: bold">[</span><span style="color: #800080; text-decoration-color: #800080; font-style: italic">None</span><span style="font-weight: bold">]</span>,
        <span style="color: #808000; text-decoration-color: #808000">last_negotiator</span>=<span style="color: #008000; text-decoration-color: #008000">'buyer'</span>,
        <span style="color: #808000; text-decoration-color: #808000">current_data</span>=<span style="color: #800080; text-decoration-color: #800080; font-style: italic">None</span>,
        <span style="color: #808000; text-decoration-color: #808000">new_data</span>=<span style="font-weight: bold">[(</span><span style="color: #008000; text-decoration-color: #008000">'buyer-63fd1074-5d79-4f49-9226-32fe564bf016'</span>, <span style="color: #800080; text-decoration-color: #800080; font-style: italic">None</span><span style="font-weight: bold">)]</span>
    <span style="font-weight: bold">)</span>
    </pre>


Some of these state variables are specific to the ``SAOMechanism`` but
others are common to all mechanisms (i.e. available in the
``MechanismState`` class which is the parent of ``SAOState``). Let’s
check some of these first:

Negotiation execution state:

- **started** Did the negotiation start?
- **running** Is the negotiation still running?
- **waiting** Is the negotiation waiting for some response from one of
  the negotiators?
- **has_error** Does the negotiation have any exceptions?

Negotiation end state:

- **broken** Did a negotiator end the negotiation (by returning
  ``ResponseType.END_NEGOTIATION`` from its ``respond()`` method).
- **timedout** The negotiation timed out without agreement.
- **agreement** The final agreement (or ``None`` if broken or timedout).

Timing state:

- **step** The current negotiation step (here it is *3* out of the *20*
  steps allowed)
- **time** The real time that passed since the negotiation started
- **relative_time** The fraction of the negotiation that passed (here it
  is :math:`(3+1)/(20+1) = 0.190\ldots`).
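
As a quick check of the ``relative_time`` value in the state printed
above (``step=3`` with 20 allowed steps), the formula quoted in the list
can be evaluated directly. The function name is ours, and we assume a
negotiation that is limited only by the number of steps.

.. code:: python

    def relative_time(step, n_steps):
        """Fraction of the negotiation that has passed, per the formula in the text."""
        return (step + 1) / (n_steps + 1)


    t = relative_time(3, 20)
    print(t)

The result agrees with the ``relative_time`` field shown in the printed
state.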

There are also SAO-specific state variables. The most important for us
are:

- **current_offer** which will be the same as the agreement as the
  negotiation has already ended.
- **current_proposer** The ID of the negotiator that proposed the
  ``current_offer``.

Using this information, we can confirm the utility value of the
agreement for the agent that accepted it as follows:

.. code:: ipython3

    negotiator_ids = [_.id for _ in s3.negotiators]
    acceptor = [i for i, _ in enumerate(negotiator_ids) if _ != s3.state.current_proposer][
        0
    ]
    print(s3.negotiators[acceptor].ufun(s3.agreement))


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.9407114624505929</span>
    </pre>


Seems OK.

Parameterizing the Negotiator
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

One issue with our negotiator is that the acceptance threshold is
hard-coded. We can add parameters to the negotiator while keeping the
default parameters of all negotiators as follows:

.. code:: ipython3

    class BetterRandomNegotiator(RandomNegotiator):
        def __init__(self, *args, acceptance_threshold=0.8, **kwargs):
            super().__init__(*args, **kwargs)
            self._th = acceptance_threshold

        def respond(self, state, source: str = ""):
            offer = state.current_offer
            if self.ufun(offer) > self._th:
                return ResponseType.ACCEPT_OFFER
            return ResponseType.REJECT_OFFER
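
The pattern used here (consume your own keyword argument and forward
everything else to the parent) is plain Python and can be tried in
isolation. In the sketch below, ``BaseNegotiator`` is a hypothetical
stand-in for the library base class and ``is_acceptable`` is a made-up
method.

.. code:: python

    class BaseNegotiator:
        """Stand-in for a library base class with its own keyword arguments."""

        def __init__(self, name=None):
            self.name = name


    class ThresholdNegotiator(BaseNegotiator):
        def __init__(self, *args, acceptance_threshold=0.8, **kwargs):
            # forward everything we do not consume to the base class
            super().__init__(*args, **kwargs)
            self._th = acceptance_threshold

        def is_acceptable(self, utility):
            return utility > self._th


    default = ThresholdNegotiator(name="buyer")
    strict = ThresholdNegotiator(name="seller", acceptance_threshold=0.95)
    print(default.is_acceptable(0.9), strict.is_acceptable(0.9))

Since ``try_negotiator`` only ever calls ``cls(name=...)``, something
like ``functools.partial(BetterRandomNegotiator, acceptance_threshold=0.95)``
should be usable in place of the class when a non-default threshold is
needed.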

Smart Aspiration Negotiator
~~~~~~~~~~~~~~~~~~~~~~~~~~~

We now turn our attention to developing our smart aspiration negotiator:
*concede as AspirationNegotiator, but offer the nearest outcome at a
given utility level to the opponent’s first offer*

To do that, we need to be able to find all outcomes above some utility
threshold. NegMAS provides inverse utility functions for this purpose
(``InverseUtilityFunction``); the code below uses the
``PresortingInverseUtilityFunction`` implementation. In general,
negotiators in NegMAS should expect that the ufun may change at any time
during the negotiation. Our negotiator will need to re-calculate the
utility value associated with each outcome at every ufun change. It can
do that in the ``on_preferences_changed()`` callback.
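
To see what inverting a utility function means, here is a toy version
for a small discrete outcome space. It is a sketch of the general idea
(sort once, then answer range queries by binary search) and makes no
claim about how NegMAS implements it; ``ToyPresortedInverse`` and its
methods are hypothetical.

.. code:: python

    from bisect import bisect_left, bisect_right


    class ToyPresortedInverse:
        """Toy inverse ufun: sort outcomes by utility once, then query ranges."""

        def __init__(self, ufun, outcomes):
            self._ufun = ufun
            self._outcomes = list(outcomes)
            self.init()

        def init(self):
            # sort once; call again whenever the utility function changes
            pairs = sorted((self._ufun(o), o) for o in self._outcomes)
            self._utils = [u for u, _ in pairs]
            self._sorted = [o for _, o in pairs]

        def some(self, rng):
            """Return the outcomes whose utility lies in the closed range ``rng``."""
            lo, hi = rng
            i = bisect_left(self._utils, lo)
            j = bisect_right(self._utils, hi)
            return self._sorted[i:j]


    outcomes = [(p, q) for p in range(3) for q in range(3)]
    inv = ToyPresortedInverse(lambda o: o[0] + 0.5 * o[1], outcomes)
    good = inv.some((2.0, 3.0))
    print(good)

Re-running ``init()`` after the utility function changes is what makes
a callback such as ``on_preferences_changed()`` the natural place to
rebuild the inverter.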

Moreover, we need some way to calculate the current utility level we are
willing to accept (and to offer around). Here we can use another
component from NegMAS called ``PolyAspiration`` which is designed
exactly for that. Let’s see what the negotiator looks like and then
explain it:

.. code:: ipython3

    from random import choice
    from negmas import PolyAspiration, PresortingInverseUtilityFunction


    class SmartAspirationNegotiator(SAONegotiator):
        _inv = None  # The ufun invertor (finds outcomes in a utility range)
        _partner_first = None  # The first offer of the partner (assumed best for it)
        _min = None  # The minimum of my utility function
        _max = None  # The maximum of my utility function
        _best = None  # The best outcome for me

        def __init__(self, *args, **kwargs):
            # initialize the base SAONegotiator (MUST be done)
            super().__init__(*args, **kwargs)

            # Initialize the aspiration mixin to start at 1.0 and concede slowly
            self._asp = PolyAspiration(1.0, "boulware")

        def on_preferences_changed(self, changes):
            # create and initialize an invertor for my ufun
            changes = [_ for _ in changes if _.type not in (PreferencesChangeType.Scale,)]
            if not changes:
                return
            self._inv = PresortingInverseUtilityFunction(self.ufun)
            self._inv.init()

            # find worst and best outcomes for me
            worst, self._best = self.ufun.extreme_outcomes()

            # and the corresponding utility values
            self._min, self._max = self.ufun(worst), self.ufun(self._best)

            # MUST call parent to avoid being called again for no reason
            super().on_preferences_changed(changes)

        def respond(self, state, source: str):
            offer = state.current_offer
            if offer is None:
                return ResponseType.REJECT_OFFER
            # set the partner's first offer when I receive it
            if not self._partner_first:
                self._partner_first = offer

            # accept if the offer is not worse for me than what I would have offered
            return super().respond(state, source)

        def propose(self, state, dest: str | None = None):
            # calculate my current aspiration level (utility level at which I will offer and accept)
            a = (self._max - self._min) * self._asp.utility_at(
                state.relative_time
            ) + self._min

            # find some outcomes (all if the outcome space is  discrete) above the aspiration level
            outcomes = self._inv.some((a - 1e-6, self._max + 1e-6), False)
            # If there are no outcomes above the aspiration level, offer my best outcome
            if not outcomes:
                return self._best

            # else if I did not  receive anything from the partner, offer any outcome above the aspiration level
            if not self._partner_first:
                return choice(outcomes)

            # otherwise, offer the outcome most similar to the partner's first offer (above the aspiration level)
            nearest, ndist = None, float("inf")
            for o in outcomes:
                d = sum((a - b) * (a - b) for a, b in zip(o, self._partner_first))
                if d < ndist:
                    nearest, ndist = o, d
            return nearest

Let’s look at this negotiator in detail. We override four methods:

- **__init__()** to initialize the negotiator. This method should
  **always** call ``super().__init__()`` to correctly initialize the
  negotiator. Moreover, we initialize the aspiration mixin to start at
  1.0 and concede slowly.
- **on_preferences_changed(changes)** to update the ufun inverter, find
  my ufun’s range and find the best outcome. You **must** call the
  parent’s implementation using ``super().on_preferences_changed()`` to
  avoid unnecessary repeated calls to this method.
- **respond()** to implement our acceptance strategy. In this case the
  default NegMAS strategy is OK for us (called in the last line). We
  only need to save the partner’s first offer here to use it in our
  offering strategy.
- **propose()** This is the core of the negotiator and implements its
  offering strategy. Let’s look at it line by line:

1. Calculate the current aspiration level which is the utility level
   above which we are going to offer

.. code:: python

      a = (self._max - self._min) * self._asp.utility_at(state.relative_time) + self._min

2. Find outcomes above my aspiration level. Note here that we use
   ``some()`` instead of ``all()`` to be compatible with continuous
   outcome spaces

.. code:: python

     outcomes = self._inv.some((a - 1e-6, self._max + 1e-6), False)

3. We are now ready to generate our offer. We need to consider three
   cases:

   - No outcomes were found above the given threshold. Here we just
     offer our best outcome:

     .. code:: python

        if not outcomes:
            return self._best

   - We do not know the partner’s first offer (i.e. we are the first to
     offer in the negotiation). Here we just choose any outcome from the
     list ``outcomes`` (i.e. those above the aspiration level):

     .. code:: python

        if not self._partner_first:
            return choice(outcomes)

   - We have the partner’s first offer. In this case, we compute the
     squared Euclidean distance between each of the outcomes we have
     (above the aspiration level) and the partner’s first offer, and
     offer the nearest one:

     .. code:: python

        d = sum((a - b) * (a - b) for a, b in zip(o, self._partner_first))
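
The three steps can also be tried end to end without NegMAS. The sketch
below is a simplified stand-in, not the negotiator above: ``aspiration``
assumes a polynomial curve of the form ``1 - t**exponent`` with an
exponent of 4.0 as a boulware-like setting, ``choose_offer`` is a
hypothetical helper, and the toy utility function is made up.

.. code:: python

    def aspiration(t, max_aspiration=1.0, exponent=4.0):
        """Assumed polynomial concession curve: near the maximum early, drops late."""
        return max_aspiration * (1.0 - t**exponent)


    def choose_offer(outcomes, ufun, t, partner_first=None):
        """Pick an offer at or above the current aspiration level."""
        utils = [ufun(o) for o in outcomes]
        lo, hi = min(utils), max(utils)
        # step 1: rescale the aspiration to my utility range
        level = (hi - lo) * aspiration(t) + lo
        # step 2: keep the outcomes at or above that level
        candidates = [o for o, u in zip(outcomes, utils) if u >= level - 1e-6]
        # step 3: the three cases described above
        if not candidates:
            return max(outcomes, key=ufun)
        if partner_first is None:
            # the tutorial picks one at random; we take the first for reproducibility
            return candidates[0]
        return min(
            candidates,
            key=lambda o: sum((x - y) ** 2 for x, y in zip(o, partner_first)),
        )


    outcomes = [(p, q) for p in range(10) for q in range(10)]
    ufun = lambda o: (o[0] + o[1]) / 18  # made-up utility in [0, 1]
    early = choose_offer(outcomes, ufun, 0.0, partner_first=(0, 9))
    later = choose_offer(outcomes, ufun, 0.5, partner_first=(0, 9))
    print(early, later)

At the start only the best outcome qualifies. Half-way through, three
outcomes qualify and the one nearest to the partner’s first offer is
chosen.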

Can you see some of the hidden assumptions in this negotiator?

While you are thinking about that, let’s check our new negotiator:

.. code:: ipython3

    s = try_negotiator(SmartAspirationNegotiator)


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_29_0.png


As you can see, now the agreement is on the pareto front which means no
money left on the table (i.e. it is impossible to increase the utility
of one partner without decreasing the utility of the other).

That is a single negotiation though. Let’s compare our new negotiator
with ``AspirationNegotiator`` on multiple negotiations:

.. code:: ipython3

    import math
    from collections import defaultdict

    # find the pareto-frontier (it is the same for all negotiations)
    frontier_utils, frontier_outcomes = s.pareto_frontier()
    nash_utils, nash_outcome = s.nash_points()[0]
    nash_welfare = sum(nash_utils)


    # define the distance (Euclidean) to pareto frontier
    def ed(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


    def pareto_dist(a, frontier):
        # find the distance to the pareto-front (in outcome-space units)
        return min(ed(a, b) for b in frontier)


    def nash_diff(a, nash_welfare):
        # find the difference in total welfare between the agreement and nash-agreement
        return nash_welfare - sum(_.ufun(a) for _ in s.negotiators)


    # collect data about distance of the agreement to the pareto frontier
    n, pdist, ndiff = 100, defaultdict(float), defaultdict(float)
    for _ in range(n):
        for cls in (AspirationNegotiator, SmartAspirationNegotiator, RandomNegotiator):
            a = try_negotiator(cls, plot=False).state.agreement
            pdist[cls.__name__] += pareto_dist(a, frontier_outcomes) / n
            ndiff[cls.__name__] += nash_diff(a, nash_welfare) / n

    print(
        f"Distance to Pareto Frontier: {dict(pdist)}\nDistance to the Nash Bargaining Solution: {dict(ndiff)}"
    )


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">Distance to Pareto Frontier: <span style="font-weight: bold">{</span><span style="color: #008000; text-decoration-color: #008000">'AspirationNegotiator'</span>: <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">4.99999999999999</span>, <span style="color: #008000; text-decoration-color: #008000">'SmartAspirationNegotiator'</span>: <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.0</span>,
    <span style="color: #008000; text-decoration-color: #008000">'RandomNegotiator'</span>: <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">5.85872665597123</span><span style="font-weight: bold">}</span>
    Distance to the Nash Bargaining Solution: <span style="font-weight: bold">{</span><span style="color: #008000; text-decoration-color: #008000">'AspirationNegotiator'</span>: <span style="color: #800080; text-decoration-color: #800080; font-weight: bold">np.float64</span><span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.3953547528665905</span><span style="font-weight: bold">)</span>,
    <span style="color: #008000; text-decoration-color: #008000">'SmartAspirationNegotiator'</span>: <span style="color: #800080; text-decoration-color: #800080; font-weight: bold">np.float64</span><span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.09861855750792485</span><span style="font-weight: bold">)</span>, <span style="color: #008000; text-decoration-color: #008000">'RandomNegotiator'</span>: <span style="color: #800080; text-decoration-color: #800080; font-weight: bold">np.float64</span><span style="font-weight: bold">(</span><span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.32620749031424884</span><span style="font-weight: bold">)}</span>
    </pre>


It is clear that our negotiator achieved its goal. It reduces the
distance to the pareto-front of the final agreement compared with
vanilla ``AspirationNegotiator`` (``pdist``) to zero while reducing the
difference in total welfare (utility sum) between the agreement and the
best possible value (at the nash-point) by about :math:`75`\ %. Can you
think of ways to further improve this design?

Back to our earlier question: *Can you see some of the hidden assumptions
in this negotiator?* Here are some answers:

1. We implicitly assume that there is a meaningful distance measure
   defined over the outcome space. This is certainly not the case if
   some of the outcomes are not cardinal. In our example, all outcomes
   are numeric but is it really meaningful to treat one day on the
   delivery issue as equal to one item as equal to one dollar? What can
   we do to avoid that? We can approximate distance over these issues by
   either matching (0) or mismatching (1). Moreover, we can consider the
   average matching score for all of the partner’s offers so far instead
   of only the first one. Try to implement that. You will need to access
   the Negotiator-Mechanism-Interface (NMI) to get the negotiation
   issues using: ``self.nmi.outcome_space``.
2. Our aspiration mixin assumes that the minimum value for aspiration is
   the reserved value instead of zero, which does not match the way we
   use it in ``propose()``. In our case, reserved values *were* zero so
   this had no effect. In a general negotiation, though, the reserved
   value should be taken into account.

Now that you have some experience developing a negotiating agent, try to
improve the design by handling these two issues.
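
As a starting point for the first exercise, here is a minimal sketch of
the matching-based distance in plain Python. The helper names
(``mismatch_distance`` and ``mean_mismatch``) are hypothetical and not
part of NegMAS, outcomes are assumed to be tuples with one value per
issue, and the example offers are made up for illustration.

.. code:: ipython3

    def mismatch_distance(a: tuple, b: tuple) -> float:
        """Fraction of issues on which two outcomes disagree (0 = same, 1 = all differ)."""
        if len(a) != len(b):
            raise ValueError("outcomes must have the same number of issues")
        if not a:
            return 0.0
        return sum(x != y for x, y in zip(a, b)) / len(a)


    def mean_mismatch(candidate: tuple, partner_offers: list[tuple]) -> float:
        """Average mismatch between a candidate and all partner offers so far."""
        if not partner_offers:
            return 0.0  # nothing received yet, so every candidate is equally close
        total = sum(mismatch_distance(candidate, offer) for offer in partner_offers)
        return total / len(partner_offers)


    # made-up partner offers as (price, quantity, delivery_time)
    partner_offers = [(9, 1, 0), (9, 2, 0), (8, 2, 1)]
    candidates = [(9, 2, 0), (3, 5, 7)]

    # pick the candidate that is closest, on average, to what the partner offered
    best = min(candidates, key=lambda o: mean_mismatch(o, partner_offers))
    print(best)

Inside a negotiator, the candidates would typically be the outcomes your
utility function already considers acceptable at the current aspiration
level, so this score only breaks ties among them.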

Running a tournament between negotiators
----------------------------------------

When evaluating your shiny new negotiator, you may want to run it
against other negotiators on a set of negotiation scenarios to evaluate
its performance. NegMAS simplifies this process by providing two types
of negotiation tournaments:

- ``neg_tournament()`` is designed to run a number of **competitor
  negotiators** against a common set of opponents.
- ``cartesian_tournament()`` is designed to run a number of competitors
  against **each other** as well as some other optional non-competing
  negotiators.

First, we need a set of ``Scenario``\ s to use for the tournament. The
following simple function generates ``n`` random scenarios with two
issues each:

.. code:: ipython3

    import time

    from negmas.gb.negotiators.timebased import (
        BoulwareTBNegotiator,
        ConcederTBNegotiator,
        LinearTBNegotiator,
    )
    from negmas.helpers import humanize_time
    from negmas.inout import Scenario
    from negmas.outcomes import make_issue
    from negmas.outcomes.outcome_space import make_os
    from negmas.preferences import LinearAdditiveUtilityFunction as U
    from negmas.tournaments.neg import cartesian_tournament


    def get_scenarios(n=2) -> list[Scenario]:
        # generates/reads the set of scenarios to be used in the tournament

        # Negotiation Issues
        issues = (
            make_issue([f"{i}" for i in range(10)], "quantity"),
            make_issue([f"{i}" for i in range(5)], "price"),
        )
        # Create n ufun groups on the same issues
        ufuns = [
            (
                U.random(issues=issues, reserved_value=(0.0, 0.6), normalized=True),
                U.random(issues=issues, reserved_value=(0.0, 0.2), normalized=True),
            )
            for _ in range(n)
        ]
        # Create a negotiation Scenario for each ufun set
        return [
            Scenario(outcome_space=make_os(issues, name=f"S{i}"), ufuns=u)
            for i, u in enumerate(ufuns)
        ]

We can now run a simple Cartesian tournament as follows:

.. code:: ipython3

    # Run the tournament with a time limit of 5 seconds per negotiation and 1 repetition
    # of each scenario for each negotiator combination.
    from negmas.helpers.strings import unique_name
    from pathlib import Path

    tic = time.perf_counter()
    path = Path.home() / "negmas" / unique_name("test")
    results = cartesian_tournament(
        competitors=[BoulwareTBNegotiator, ConcederTBNegotiator, LinearTBNegotiator],
        scenarios=get_scenarios(),
        mechanism_params=dict(time_limit=5),  # time limit per negotiation in seconds
        n_repetitions=1,  # number of repetitions of each negotiation (these are not combined in score)
        path=path,
    )
    print(f"Done in {humanize_time(time.perf_counter() - tic)}")


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">Will run <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">36</span> negotiations on <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">2</span> scenarios between <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">3</span> competitors
    </pre>


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">               strategy     score
    <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0</span>  BoulwareTBNegotiator  <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.978630</span>
    <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">1</span>    LinearTBNegotiator  <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.902408</span>
    <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">2</span>  ConcederTBNegotiator  <span style="color: #008080; text-decoration-color: #008080; font-weight: bold">0.703577</span>
    </pre>


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">Done in  4s
    </pre>


.. code:: ipython3

    print(f"Done in {humanize_time(time.perf_counter() - tic, show_ms=True)}")


.. raw:: html

    <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">Done in  4s47ms
    </pre>


After running the tournament, we can check summary statistics of the
``advantage`` score for each negotiator type:

.. code:: ipython3

    results.scores_summary[("advantage",)]


.. raw:: html

    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>count</th>
          <th>mean</th>
          <th>std</th>
          <th>min</th>
          <th>25%</th>
          <th>50%</th>
          <th>75%</th>
          <th>max</th>
        </tr>
        <tr>
          <th>strategy</th>
          <th></th>
          <th></th>
          <th></th>
          <th></th>
          <th></th>
          <th></th>
          <th></th>
          <th></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>BoulwareTBNegotiator</th>
          <td>24.0</td>
          <td>0.978630</td>
          <td>0.037625</td>
          <td>0.881063</td>
          <td>0.981384</td>
          <td>0.997838</td>
          <td>1.000000</td>
          <td>1.000000</td>
        </tr>
        <tr>
          <th>LinearTBNegotiator</th>
          <td>24.0</td>
          <td>0.902408</td>
          <td>0.156692</td>
          <td>0.551182</td>
          <td>0.909330</td>
          <td>0.982856</td>
          <td>0.991920</td>
          <td>1.000000</td>
        </tr>
        <tr>
          <th>ConcederTBNegotiator</th>
          <td>24.0</td>
          <td>0.703577</td>
          <td>0.234388</td>
          <td>0.377042</td>
          <td>0.534894</td>
          <td>0.638668</td>
          <td>0.982856</td>
          <td>0.989227</td>
        </tr>
      </tbody>
    </table>
    </div>


As expected, Boulware got higher scores compared with Linear and
Conceder time-based strategies.
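
The counts reported above (36 negotiations and 24 scores per strategy)
can be checked by hand. The sketch below rests on two assumptions about
the default settings of ``cartesian_tournament`` that this tutorial does
not state: every ordered pair of competitors negotiates (including a
competitor against itself), and each two-party scenario is run a second
time with the utility functions swapped between the two negotiators.

.. code:: ipython3

    from itertools import product

    competitors = ["Boulware", "Linear", "Conceder"]
    n_scenarios = 2
    n_repetitions = 1
    n_rotations = 2  # assumption: each scenario is also run with the ufuns swapped

    # assumption: all ordered pairs play, including self-play
    pairings = list(product(competitors, repeat=2))
    runs_per_pairing = n_scenarios * n_rotations * n_repetitions
    n_negotiations = len(pairings) * runs_per_pairing

    # every negotiation produces one score per participating negotiator
    scores_per_strategy = {
        c: sum(pair.count(c) for pair in pairings) * runs_per_pairing
        for c in competitors
    }

    print(n_negotiations)
    print(scores_per_strategy)

Under these assumptions the numbers agree with the tournament output,
which is a quick sanity check that no negotiation was silently skipped.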

We can also plot a kernel density estimate (KDE) of the score
distribution for each negotiator type.

.. code:: ipython3

    results.scores.groupby("strategy")["advantage"].plot(kind="kde")
    plt.legend()
    plt.show()


.. image:: 03.develop_new_negotiator_files/03.develop_new_negotiator_42_0.png


The complete logs, with a wealth of extra information, are saved in the
``results.path`` folder:

.. code:: ipython3

    path


.. parsed-literal::

    PosixPath('/Users/yasser/negmas/test/20250315H170000081604crlguRai')
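
To see what was saved there, you can simply list the folder. The helper
below is a hypothetical convenience function (``list_tree`` is not part
of NegMAS) that uses only the standard library. It makes no assumption
about which files the tournament writes, because that may differ
between NegMAS versions.

.. code:: ipython3

    from pathlib import Path


    def list_tree(root: Path, max_depth: int = 2) -> list[str]:
        """Relative paths of files and folders under root, down to max_depth levels."""
        root = Path(root)
        entries = []
        for p in sorted(root.rglob("*")):
            rel = p.relative_to(root)
            if len(rel.parts) <= max_depth:
                # mark folders with a trailing slash
                entries.append(rel.as_posix() + ("/" if p.is_dir() else ""))
        return entries


    # Usage, with `path` being the tournament folder defined earlier:
    # for entry in list_tree(path):
    #     print(entry)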


Download :download:`Notebook<notebooks/03.develop_new_negotiator.ipynb>`.

